Skip to content

[runtime] Binary section data loading for extra ELF images #6097

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 31, 2017

Conversation

spevans
Copy link
Contributor

@spevans spevans commented Dec 6, 2016

Use static constructors for registering dlopen-ed images with the runtime.

  • For ELF targets, add ImageInspectionInit.cpp that wil be prepended
    to each ELF shared object to automatically add the section data
    as the object is dynamically loaded (rdar://problem/19045112)

@spevans
Copy link
Contributor Author

spevans commented Dec 6, 2016

@jckarter couple of notes:

I had another go at adding a struct SectionInfo into the swift_section.S but this time I added it as a new symbol and didnt change the existing ones. This meant I could get rid of the getSectionInfo(). If this isnt ideal I have another version keeping the old method

I couldnt work out what would happen if a 2nd thread dlopend an image while dl_iterate_phdr() was in progress so I just removed all of the dlopen()/dl_iterate_phdr() and just used the constructors. This should have the advantage of eliminating lots of callbacks for images we arent interested in and also saves on dlsym() lookups

@jckarter jckarter self-assigned this Dec 6, 2016
@jrose-apple
Copy link
Contributor

:-( We're trying very hard to avoid static constructors. I don't think this is worth the tradeoff if a particular program never asks for conformances.

@jckarter
Copy link
Contributor

jckarter commented Dec 6, 2016

Static constructors are the only public interface Linux and Windows' dynamic loader have for load-time hooks, so I think it's fine in this case, though we should be careful not to do any more work than absolutely necessary to register the new image with the runtime.

@jckarter
Copy link
Contributor

jckarter commented Dec 6, 2016

(At the ntdll level, Windows does have LdrRegisterDllNotification, but that's not documented or guaranteed to be supported in future versions.)

void addImageProtocolConformanceBlock(const SectionInfo *block);
void addImageTypeMetadataRecordBlock(const SectionInfo *block);

#endif // defined(__ELF__) || defined(__ANDROID__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is worth changing. We're wasting space for pointers that can be trivially recovered.

loadSectionData() {
addImageProtocolConformanceBlock(&__swift2_protocol_conformances_block);
addImageTypeMetadataRecordBlock(&__swift2_type_metadata_block);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is doing too much work, since it will force initialization of the lookup caches even if they haven't been enabled yet, and is not robust ABI if libswiftCore.so changes and requires new per-image registration hooks in the future. It would be better for the runtime to export an entry point swift_registerLoadedImage(const void *image), and have the static constructor do nothing but call that with a handle to the loaded image, something like:

SWIFT_RUNTIME_EXPORT extern "C" void swift_registerLoadedImage(const void *dsoHandle);

__attribute__((constructor))
static void onLoad() {
  swift_registerLoadedImage(__dso_handle);
}

The runtime implementation of swift_registerLoadedImage should be responsible for extracting and registering the sections that parts of the runtime are interested in, and do so only if those parts have already been initialized. It should be as close to a no-op as possible if an image is loaded before any lookup tables are initialized.

We should also make sure that this static constructor is not linked into executables or into libswiftCore.so itself, since the executable and runtime by their nature have to have been loaded before any lookup caches are triggered.

@jckarter
Copy link
Contributor

jckarter commented Dec 8, 2016

Another approach that would maximize laziness and avoid the need for a static constructor, at the cost of making failed lookups a bit more expensive, would be to have the lookup caches fall back to re-iterating all loaded images using dl_iterate_phdr if a lookup doesn't get a cache hit after processing all currently-registered sections. I'm not sure how expensive the dl_iterate_phdr iteration is. The approach could be paired with a simplified static constructor that just sets a loadedNewImages flag in the runtime, which failed conformance lookup would check before re-iterating the images.

Alternatively, does the dynamic linker on Linux provide any generation counter or something else we could easily inspect to detect whether additional images had been dlopen-ed since the last time we populated the cache?

@spevans spevans force-pushed the pr_solib_constructor branch from 6c79d2b to 8c5f358 Compare December 12, 2016 12:28
@spevans
Copy link
Contributor Author

spevans commented Dec 12, 2016

@jckarter I updated to remove the SectionInfo that was added to the swift_sections.S

I also renamed some of the functions used by the static init construtor, they may have implied that the caches were being initialised at startup but the lazy initialisation is still being used. The static init construtor just saves the SectionInfo to a list and its not used until initializeProtocolConformanceLookup() or initializeTypeMetadataRecordLookup() is called as is currently the case.

This also has the advantage that we dont need to avoid linking the static construtor into libswiftCore.so

I had a look at glibc but I couldnt see any export of information about images loaded by dlopen. Also, calling dl_iterate_phdr is quite expensive as a lock is held for its entire duration to stop any other images being loaded and dlsym is also quite complex, so I think this current solution does the least work while also being sufficiently lazy.

@jckarter
Copy link
Contributor

This would still hardcode an exact set of operations into every dynamic library, which would prevent libswiftCore.so from being upgraded to a version that needs additional per-image hooks. I really think that the static constructor should be either a single call to a function provided by the runtime, so that the runtime has full control of what happens. Organization-wise, that implementation should be able to live entirely in ImageInspectionELF.cpp so that we don't need to export its details out of the runtime, beyond a single public entry point.

@spevans spevans force-pushed the pr_solib_constructor branch from 8c5f358 to 82627c2 Compare December 13, 2016 01:20
@spevans
Copy link
Contributor Author

spevans commented Dec 13, 2016

Ive simplified the static constructor to just pass an address in the DSO to addNewDSOImage in ImageInspectionELF.cpp. When the caches are later initialised this address can be used to find the image filename using dladdr and the section data loaded via dlopen/dlsym. This keeps all logic in ImageInspectionELF.cpp

}, nullptr);

lazyInitProtocolConformance(addr);
lazyInitTypeMetadataRecord(addr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code organization looks much better, thanks! Keep in mind though that __attribute__((constructor))s fire not only at dlopen time, but at process load time, so all this laziness will be wasted when addNewDSOImage gets invoked early at process start. We need to keep track of whether each cache has been initialized to avoid doing any work if it hasn't. Something like this:

static std::atomic<bool> didInitializeProtocolConformanceLookup;
void swift::initializeProtocolConformanceLookup() {
  // Set `didInitializeProtocolConformanceLookup` so that loading
  // future Swift images will register with the lookup cache.
  didInitializeProtocolConformanceLookup.store(true, std::memory_order_release);
  // Search the loaded dls...
  ...
}

extern "C"
void swift::swift_addNewDSOImage(const void *addr) {
  if (didInitializeProtocolConformanceLookup.load(std::memory_order_acquire)) {
    addImageProtocolConformanceBlockCallback(...get section from addr...);
  }
}

and a similar pattern for the type metadata lookup cache.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code actually relies on addNewDSOImage being called before either cache is initialised as it creates a Mutex and array for each cache. Then addNewDSOImage just stores an address in the DSO into each list. When a cache is initialised, the addresses in that cache's list are used to load the section data from the DSO, then the list is freed and its ptr set to nullptr. Further addNewDSOImage calls will then just load the section data for the cache rather than storing the address.

The per cache mutex is used in case a new DSO is loaded whilst the current DSOs are being processed.

Does this make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. If that's the intended design, then swift_once here isn't giving any benefit over a global constructor, since the entry point will be called at load time in almost any real program. I still think we should avoiding doing any work until the caches are used; it delays start time and wastes effort in a program that never uses the lookup mechanisms.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if everything Swift is statically linked into the main executable? Does swift_addNewDSOImage() get called? If it does not, does that break anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gparker If it is statically linked then addNewDSOImage() is a nop and the cache initialise functions just access the sections directly, see https://github.com/apple/swift/blob/master/stdlib/public/runtime/ImageInspectionStatic.cpp

@spevans spevans force-pushed the pr_solib_constructor branch from 82627c2 to a4bca6a Compare December 19, 2016 16:41
@slavapestov
Copy link
Contributor

@jckarter Can you take another look at this? Thanks!

@spevans spevans force-pushed the pr_solib_constructor branch 3 times, most recently from 4ed3713 to 81b7abd Compare January 23, 2017 17:33
@spevans
Copy link
Contributor Author

spevans commented Jan 23, 2017

@jckarter Ive had another look at this and used your suggestion for the constructor of just testing a 'lookup initialized' flag for each cache. This keeps addNewDSOImage() as lazy and minimal as possible.

I also fixed a bug with the dl_iterate_phdr() callback where it gets called for both the dynamic loader and the main executable with a filename of NULL so was loading the data in the executable twice.

Copy link
Contributor

@jckarter jckarter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just a few minor requests:

};

// Called by injected constructors when a dynamic library is loaded.
void addNewDSOImage(const void *addr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is exported from the runtime, we should give it an extern "C" name like swift_addNewDSOImage, so that the runtime ABI isn't dependent on the C++ ABI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After #6820 it's spelled SWIFT_RUNTIME_EXPORT. That handles all of the mangling and visibility fuss required to provide a symbol that is called by outside code.

@@ -30,6 +30,18 @@ namespace swift {
void *symbolAddress;
};

#if defined(__ELF__) || defined(__ANDROID__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we separate this #if block out into a separate ImageInspectionELF.h header?

// Called from ImageInspectionInit
void
swift::addNewDSOImage(const void *addr) {
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't have to exist at all in a static executable. I'd rather we get a link error if we accidentally try to link in the static constructor stub than quietly link in a useless static constructor.

// that addNewDSOImage() sees a consistent state. If it was set outside the
// dl_interate_phdr() call then it could result in images being missed or
// added twice.
inspectArgs->didInitializeLookup->store(true, std::memory_order_release);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's guaranteed that dlopen and dl_iterate_phdr hold the same lock, then the didInitializeLookup flag should be able to be nonatomic.

static SectionInfo getSectionInfo(const char *imageName,
const char *sectionName) {
SectionInfo sectionInfo = { 0, nullptr };
void *handle = dlopen(imageName, RTLD_LAZY);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also pass RTLD_NOLOAD? The previous version did.

- For ELF targets, keep track of shared objects as they are
  dynamically loaded so that section data can be added to
  the protocol conformance and type metadata caches after
  initialisation (rdar://problem/19045112).
- Rename swift::addNewDSOImage() to swift_addNewDSOImage() and
  export using SWIFT_RUNTIME_EXPORT.

- Move ELF specific parts of ImageInspection.h into
  ImageInspectionELF.h.
- Create separate swift_begin.o/swift_end.o for lib/swift and
  lib/swift_static. The static swift_begin.o does not call
  swift_addNewDSOImage() at startup.

- Update ToolChains.cpp to use the correct swift_begin.o/swift_end.o
  files for the `-static-stdlib` and `-static-executable` options.
@spevans spevans force-pushed the pr_solib_constructor branch from 81b7abd to 5c993b4 Compare January 27, 2017 15:30
@spevans
Copy link
Contributor Author

spevans commented Jan 27, 2017

@jckarter I have fixed all the minor requests and rebased.

I removed the RTLD_NOLOAD as I dont think it is necessary. dlopen(3) says it is only used to test if a shared object is already resident which should always be the case by the time the dl_iterate_phdr() calls the callback.

@jckarter
Copy link
Contributor

@spevans What's the behavior if a different file is on disk at the same path from what was loaded if RTLD_NOLOAD isn't passed? Seems like a nice consistency check.

@spevans
Copy link
Contributor Author

spevans commented Jan 27, 2017

@jckarter If the file is already loaded then then the filesystem isnt examined. Since the filename comes from either dl_iterate_phdr() or dladdr() then it should already be loaded and in the link map.

However I see your point about it acting as an extra check, the returned handle should never be NULL in which case I think exiting with fatalError() may be the correct action, something like:

void *handle = dlopen(imageName, RTLD_LAZY | RTLD_NOLOAD);
if (!handle) {
  fatalError(/* flags = */ 0, "dlopen() failed on `%s': %s", imageName,
             dlerror());
} else {
  void *symbol = dlsym(handle, sectionName);

does this look ok?

@jckarter
Copy link
Contributor

Sounds good. You could leave out the else since the nesting isn't necessary after the if errors out.

@spevans
Copy link
Contributor Author

spevans commented Jan 28, 2017

@jckarter Ive added the RTLD_NOLOAD and extra error checking

@jckarter
Copy link
Contributor

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test OS X Platform
Git Commit - 5c993b4
Test requested by - @jckarter

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test Linux Platform
Git Commit - 5c993b4
Test requested by - @jckarter

@jckarter
Copy link
Contributor

@swift-ci Please test

@jckarter
Copy link
Contributor

Looks great!

@spevans
Copy link
Contributor Author

spevans commented Jan 28, 2017

Are the CI failures just temporary issues? The log seemed to show it being an issue in lldb?

@jckarter
Copy link
Contributor

@spevans There's a known bug in the Jenkins Github plugin that it doesn't like rebases. You just need to trigger CI twice for it to get the right SHA, AIUI. (@shahmishal might have more background)

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test Linux Platform
Git Commit - 7d2a9aa
Test requested by - @jckarter

@jckarter
Copy link
Contributor

I see the lldb failure now. @egranata Are we disturbing the format of any metadata objects LLDB is looking at?

@jckarter
Copy link
Contributor

Seems to be correlated, since the builds right before the ones I started for this PR passed…

@spevans
Copy link
Contributor Author

spevans commented Jan 28, 2017

OK, I'll take a look and see if I can replicate the test failure

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test OS X Platform
Git Commit - 7d2a9aa
Test requested by - @jckarter

@spevans
Copy link
Contributor Author

spevans commented Jan 28, 2017

@jckarter Looks like the same test failed in the last CI of 'Project OSS - Swift Package - Ubuntu 16.10 (master)' https://ci.swift.org/job/oss-swift-package-linux-ubuntu-16_10/328/

@jckarter
Copy link
Contributor

OK, might be unrelated then. Let me try again.

@jckarter
Copy link
Contributor

@swift-ci Please test

@spevans
Copy link
Contributor Author

spevans commented Jan 29, 2017

@jckarter I think there is something broken in master for the linux tests as https://ci.swift.org/job/oss-swift-package-linux-ubuntu-16_10 has been failing for the last day or so

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test Linux Platform
Git Commit - 7d2a9aa
Test requested by - @jckarter

@jckarter
Copy link
Contributor

I agree. We can't merge additional PRs until master is clean, though. We'll have to wait for the lldb test to be fixed.

@jckarter
Copy link
Contributor

Master seems to be clean now, let's try again.

@jckarter
Copy link
Contributor

@swift-ci Please test Linux

@jckarter jckarter merged commit b8b7b5f into swiftlang:master Jan 31, 2017
@jckarter
Copy link
Contributor

Thanks again @spevans!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants