-
Notifications
You must be signed in to change notification settings - Fork 10.5k
Add driver flag to precompile Swift-compatible explicit Clang modules. #28107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cc @DougGregor, could you add a reviewer to this please? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm glad you're working on this, because it's really important for us to be able to separately build the Clang modules on which a Swift compilation job depends. I feel like there is a bunch of code duplication here where the new build-a-Clang-module path is kept separate from similar code paths (in particular, bridging PCH generation) that is likely to stray. Specifically, I think I prefer integrating this path as another "frontend" action (the way we do for emitting a bridging PCH) so we aren't duplicating so much command-line parsing and setup logic, and get tools like output file maps and such for free. What do you think?
lib/ClangImporter/ClangImporter.cpp
Outdated
return LangOpts->CPlusPlus ? clang::Language::CXX : clang::Language::C; | ||
} | ||
|
||
bool ClangImporter::emitPrecompiledModule(StringRef moduleMapPath, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a significant amount of overlap between this and ClangImporter::emitBridgingPCH
. Can we unify these two functions, so the somewhat arcane knowledge of how to get Clang to do this properly isn't duplicated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
tools/driver/dump_pcm_main.cpp
Outdated
// This is functionally equivalent to Clang's -module-file-info frontend flag, | ||
// but is provided here to guarantee compatibility with modules emitted by the | ||
// Swift compiler since those modules will not be compatible with a different | ||
// version of Clang (even for debug-printing). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought Clang's -module-file-info
was based on relatively stable parts of the Clang module file format? Personally, Could we rely on that for the "dump a module's metadata for debugging" case, rather than have this much Clang-specific logic in Swift?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, the Clang used to generate the module must be the same version as the one that runs -module-file-info
on it, or you get an error like:
error: PCH file built from a different branch () than the compiler ((clang-1100.0.33.8))
And Xcode provides no guarantee that its clang
binary and the Clang linked into its swiftc
binary will be the same version; over the past few versions of Swift, it's more often been the case that they're slightly different revisions. For example, Xcode 11's clang
is clang-1100.0.33.8
but swiftc
's clang is clang-1100.0.33.7
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, thanks for correcting my faulty memory. I agree it makes sense to have this functionality in Swift.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've turned this into a -dump-pcm
no-output mode in the standard driver/frontend.
tools/driver/emit_pcm_main.cpp
Outdated
} | ||
|
||
StringRef Filename = Invocation.ModuleMapFilename; | ||
auto ErrOrBuf = llvm::MemoryBuffer::getFile(Filename); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be going through the virtual file system rather than poking directly at the file system?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, probably. I copied this from the modulewrap main, but that might be a different situation since the input file to that action is always an output from a previous action? (Or maybe that should also be updated at some point?)
If I migrate this to a frontend action, then this separate check can probably go away and be folded into something more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think modulewrap main should also be updated at some point; it's part of the reason that I think it's useful to make this a frontend action, because it's easier to get this stuff set up properly.
tools/driver/emit_pcm_main.cpp
Outdated
} | ||
}; | ||
|
||
int emit_pcm_main(ArrayRef<const char *> Args, const char *Argv0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a large amount of code to effectively make this its own driver, which I'm concerned will get out of sync with the way we import the resulting Clang modules in subtle ways. Could we instead handle this the way the emission of the precompiled bridging header is handled, as a frontend command that relies on all of the same command-line option parsing and setup as a compile job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making it similar to -emit-pch
makes sense as a frontend job, especially if you want the Swift driver to at some point in the future be able to construct explicit PCM jobs as part of a compilation, which is a use case I hadn't considered.
For my use case (integration with Bazel), I want to invoke this completely independently, so I'd want it to be a stable driver invocation as well; for example, -emit-pch
is a frontend-only flag because that job is only created under the hood when -enable-bridging-pch
is passed to the driver, so I wouldn't want Bazel's integration to rely on invoking the frontend directly since that's considered unstable. As a standalone invocation, I'd want swiftc -emit-pcm -module-name Foo path/to/some/module.modulemap
to construct only the PCM job and ensure that only modulemap files are supplied as inputs.
Are there any frontend actions other than compilation that I can look at for guidance to model this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up emulating what was done in FrontendTool for -emit-pch
, but also made the flag available directly to the driver. So now if -emit-pcm
(or -dump-pcm
) is used, it's treated as a SingleCompile
action type, and FrontendTool checks that flag and hands off control to ClangImporter.
I thought I would also need to create a special FrontendAction
subclass like GeneratePCHAction
, but that ended up not being necessary—it looks like that would only be needed if the driver wanted to register PCM actions separately as part of a larger compilation job?
Please let me know if this updated version is closer to what you had in mind!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that having swiftc -emit-pcm -module-name Foo path/to/some/module.modulemap
effectively create only a PCM-emitting job makes sense. The "compilation" path is really the right one for the driver here, because you want a lot of the same code to pass through (VFS, output file maps, most compilation options, etc.).
This seems to be working for some basic modules (like the ones in the test case in the PR), but I'm seeing some odd behavior (an assertion failure) when I try to build and then consume explicit modules for some of the Apple SDK system frameworks/modules. (We need to turn on In order to do this, I have the following modules that I build explicitly with
Building
I threw some Then, I noticed the differing semantics of This behavior happened both at HEAD, and in the 5.1 branch. I didn't observe it in an older prototype that I've been experimenting with internally, where I'm using sources that date back to sometime in June (I'll have to fish out more specific timestamps). I don't think what I'm seeing is related to my PR specifically, but it's something I'd like to figure out before we're able to make real use of this. Is the behavior I'm observing (especially the difference between |
Following up to my previous comment, I observe the same behavior if I build and consume the same explicit modules using the standalone So, at least the issue appears to be unrelated to this PR or to Swift/ClangImporter. I'll see if I can get the same behavior on upstream clang next. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this looks great to me!
Thanks Doug! If there's nothing else blocking, can you kick off the tests and merge if successful? I don't have permissions to do so. |
@swift-ci please test |
Some additional context/discussion can be found in this Swift Forums thread.
This change adds two flags to the Swift driver:
-emit-pcm
, which takes as input a module map and generates an explicit precompiled module using a Clang compiler instance from Swift's ClangImporter.-dump-pcm
, which takes as input a .pcm file generated by the Swift compiler (either by-emit-pcm
, or one from the implicit module cache) and prints info about it as a debugging aid (equivalent toclang -module-file-info
)Such a .pcm file can be passed to
swiftc
using-Xcc -fmodule-file=path/to/module.pcm
and imported like any other C/Objective-C module.Example usage
Motivation
When distributing Swift builds consisting of large numbers of Objective-C and Swift modules remotely with build systems like Bazel, we can't rely on the implicit module cache to reduce repeated work—the cache won't be the same from machine to machine, and it's not possible to share it because absolute paths to the workspaces will be different (and those paths are written into the modules).
This means that each
swiftc
job's ClangImporter must redo the work of parsing the C/Objective-C headers used by a Swift module, which greatly slows down our remote builds. For dependency graphs with large amounts of Objective-C, we a significant amount of time reported by the compiler spent in Name Binding, and file system traces indicate that much of that is processing headers.Instead, we'd like to explicitly precompile those C modules and propagate the .pcm artifacts as part of the build graph, thus doing the work only once. Since precompiled modules are not binary-compatible across different versions of Clang or across invocations where the non-benign invocation options differ, the only stable way to make a .pcm that is compatible with Swift is to extend
swiftc
to have its own ClangImporter instance do it.