-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[runtime] Binary section data loading for extra ELF images #6097
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@jckarter couple of notes: I had another go at adding a struct SectionInfo into the swift_section.S but this time I added it as a new symbol and didnt change the existing ones. This meant I could get rid of the getSectionInfo(). If this isnt ideal I have another version keeping the old method I couldnt work out what would happen if a 2nd thread dlopend an image while dl_iterate_phdr() was in progress so I just removed all of the dlopen()/dl_iterate_phdr() and just used the constructors. This should have the advantage of eliminating lots of callbacks for images we arent interested in and also saves on dlsym() lookups |
:-( We're trying very hard to avoid static constructors. I don't think this is worth the tradeoff if a particular program never asks for conformances. |
Static constructors are the only public interface Linux and Windows' dynamic loader have for load-time hooks, so I think it's fine in this case, though we should be careful not to do any more work than absolutely necessary to register the new image with the runtime. |
(At the ntdll level, Windows does have |
void addImageProtocolConformanceBlock(const SectionInfo *block); | ||
void addImageTypeMetadataRecordBlock(const SectionInfo *block); | ||
|
||
#endif // defined(__ELF__) || defined(__ANDROID__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is worth changing. We're wasting space for pointers that can be trivially recovered.
loadSectionData() { | ||
addImageProtocolConformanceBlock(&__swift2_protocol_conformances_block); | ||
addImageTypeMetadataRecordBlock(&__swift2_type_metadata_block); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is doing too much work, since it will force initialization of the lookup caches even if they haven't been enabled yet, and is not robust ABI if libswiftCore.so
changes and requires new per-image registration hooks in the future. It would be better for the runtime to export an entry point swift_registerLoadedImage(const void *image)
, and have the static constructor do nothing but call that with a handle to the loaded image, something like:
SWIFT_RUNTIME_EXPORT extern "C" void swift_registerLoadedImage(const void *dsoHandle);
__attribute__((constructor))
static void onLoad() {
swift_registerLoadedImage(__dso_handle);
}
The runtime implementation of swift_registerLoadedImage
should be responsible for extracting and registering the sections that parts of the runtime are interested in, and do so only if those parts have already been initialized. It should be as close to a no-op as possible if an image is loaded before any lookup tables are initialized.
We should also make sure that this static constructor is not linked into executables or into libswiftCore.so
itself, since the executable and runtime by their nature have to have been loaded before any lookup caches are triggered.
Another approach that would maximize laziness and avoid the need for a static constructor, at the cost of making failed lookups a bit more expensive, would be to have the lookup caches fall back to re-iterating all loaded images using Alternatively, does the dynamic linker on Linux provide any generation counter or something else we could easily inspect to detect whether additional images had been |
6c79d2b
to
8c5f358
Compare
@jckarter I updated to remove the SectionInfo that was added to the swift_sections.S I also renamed some of the functions used by the static init construtor, they may have implied that the caches were being initialised at startup but the lazy initialisation is still being used. The static init construtor just saves the SectionInfo to a list and its not used until This also has the advantage that we dont need to avoid linking the static construtor into I had a look at glibc but I couldnt see any export of information about images loaded by |
This would still hardcode an exact set of operations into every dynamic library, which would prevent |
8c5f358
to
82627c2
Compare
Ive simplified the static constructor to just pass an address in the DSO to |
}, nullptr); | ||
|
||
lazyInitProtocolConformance(addr); | ||
lazyInitTypeMetadataRecord(addr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code organization looks much better, thanks! Keep in mind though that __attribute__((constructor))
s fire not only at dlopen
time, but at process load time, so all this laziness will be wasted when addNewDSOImage
gets invoked early at process start. We need to keep track of whether each cache has been initialized to avoid doing any work if it hasn't. Something like this:
static std::atomic<bool> didInitializeProtocolConformanceLookup;
void swift::initializeProtocolConformanceLookup() {
// Set `didInitializeProtocolConformanceLookup` so that loading
// future Swift images will register with the lookup cache.
didInitializeProtocolConformanceLookup.store(true, std::memory_order_release);
// Search the loaded dls...
...
}
extern "C"
void swift::swift_addNewDSOImage(const void *addr) {
if (didInitializeProtocolConformanceLookup.load(std::memory_order_acquire)) {
addImageProtocolConformanceBlockCallback(...get section from addr...);
}
}
and a similar pattern for the type metadata lookup cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code actually relies on addNewDSOImage
being called before either cache is initialised as it creates a Mutex and array for each cache. Then addNewDSOImage
just stores an address in the DSO into each list. When a cache is initialised, the addresses in that cache's list are used to load the section data from the DSO, then the list is freed and its ptr set to nullptr
. Further addNewDSOImage
calls will then just load the section data for the cache rather than storing the address.
The per cache mutex is used in case a new DSO is loaded whilst the current DSOs are being processed.
Does this make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. If that's the intended design, then swift_once
here isn't giving any benefit over a global constructor, since the entry point will be called at load time in almost any real program. I still think we should avoiding doing any work until the caches are used; it delays start time and wastes effort in a program that never uses the lookup mechanisms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if everything Swift is statically linked into the main executable? Does swift_addNewDSOImage()
get called? If it does not, does that break anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gparker If it is statically linked then addNewDSOImage()
is a nop and the cache initialise functions just access the sections directly, see https://github.com/apple/swift/blob/master/stdlib/public/runtime/ImageInspectionStatic.cpp
82627c2
to
a4bca6a
Compare
@jckarter Can you take another look at this? Thanks! |
4ed3713
to
81b7abd
Compare
@jckarter Ive had another look at this and used your suggestion for the constructor of just testing a 'lookup initialized' flag for each cache. This keeps I also fixed a bug with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just a few minor requests:
}; | ||
|
||
// Called by injected constructors when a dynamic library is loaded. | ||
void addNewDSOImage(const void *addr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is exported from the runtime, we should give it an extern "C"
name like swift_addNewDSOImage
, so that the runtime ABI isn't dependent on the C++ ABI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After #6820 it's spelled SWIFT_RUNTIME_EXPORT
. That handles all of the mangling and visibility fuss required to provide a symbol that is called by outside code.
@@ -30,6 +30,18 @@ namespace swift { | |||
void *symbolAddress; | |||
}; | |||
|
|||
#if defined(__ELF__) || defined(__ANDROID__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we separate this #if block out into a separate ImageInspectionELF.h
header?
// Called from ImageInspectionInit | ||
void | ||
swift::addNewDSOImage(const void *addr) { | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't have to exist at all in a static executable. I'd rather we get a link error if we accidentally try to link in the static constructor stub than quietly link in a useless static constructor.
// that addNewDSOImage() sees a consistent state. If it was set outside the | ||
// dl_interate_phdr() call then it could result in images being missed or | ||
// added twice. | ||
inspectArgs->didInitializeLookup->store(true, std::memory_order_release); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's guaranteed that dlopen
and dl_iterate_phdr
hold the same lock, then the didInitializeLookup
flag should be able to be nonatomic.
static SectionInfo getSectionInfo(const char *imageName, | ||
const char *sectionName) { | ||
SectionInfo sectionInfo = { 0, nullptr }; | ||
void *handle = dlopen(imageName, RTLD_LAZY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also pass RTLD_NOLOAD
? The previous version did.
- For ELF targets, keep track of shared objects as they are dynamically loaded so that section data can be added to the protocol conformance and type metadata caches after initialisation (rdar://problem/19045112).
- Rename swift::addNewDSOImage() to swift_addNewDSOImage() and export using SWIFT_RUNTIME_EXPORT. - Move ELF specific parts of ImageInspection.h into ImageInspectionELF.h.
- Create separate swift_begin.o/swift_end.o for lib/swift and lib/swift_static. The static swift_begin.o does not call swift_addNewDSOImage() at startup. - Update ToolChains.cpp to use the correct swift_begin.o/swift_end.o files for the `-static-stdlib` and `-static-executable` options.
81b7abd
to
5c993b4
Compare
@jckarter I have fixed all the minor requests and rebased. I removed the |
@spevans What's the behavior if a different file is on disk at the same path from what was loaded if RTLD_NOLOAD isn't passed? Seems like a nice consistency check. |
@jckarter If the file is already loaded then then the filesystem isnt examined. Since the filename comes from either However I see your point about it acting as an extra check, the returned handle should never be
does this look ok? |
Sounds good. You could leave out the |
@jckarter Ive added the |
@swift-ci Please test |
Build failed |
Build failed |
@swift-ci Please test |
Looks great! |
Are the CI failures just temporary issues? The log seemed to show it being an issue in lldb? |
@spevans There's a known bug in the Jenkins Github plugin that it doesn't like rebases. You just need to trigger CI twice for it to get the right SHA, AIUI. (@shahmishal might have more background) |
Build failed |
I see the lldb failure now. @egranata Are we disturbing the format of any metadata objects LLDB is looking at? |
Seems to be correlated, since the builds right before the ones I started for this PR passed… |
OK, I'll take a look and see if I can replicate the test failure |
Build failed |
@jckarter Looks like the same test failed in the last CI of 'Project OSS - Swift Package - Ubuntu 16.10 (master)' https://ci.swift.org/job/oss-swift-package-linux-ubuntu-16_10/328/ |
OK, might be unrelated then. Let me try again. |
@swift-ci Please test |
@jckarter I think there is something broken in master for the linux tests as https://ci.swift.org/job/oss-swift-package-linux-ubuntu-16_10 has been failing for the last day or so |
Build failed |
I agree. We can't merge additional PRs until master is clean, though. We'll have to wait for the lldb test to be fixed. |
Master seems to be clean now, let's try again. |
@swift-ci Please test Linux |
Thanks again @spevans! |
Use static constructors for registering dlopen-ed images with the runtime.
to each ELF shared object to automatically add the section data
as the object is dynamically loaded (rdar://problem/19045112)