-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[IRGen] Mix -profile-generate into module hash #26774
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci please smoke test |
This is for @eeckstein |
Thanks for working on this. To be on the safe side, should other types of instrumentation be included here as well (say, {a,t}san)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
Question. Would it be possible to change this into some sort of exhaustive switch for Options? That way if someone adds a new option like this, they get a warning telling them that they may need to update this code. |
In other words, I am worried that the code as written will cause this same bug to occur again. If we had some sort of switch associated with it, at least we could get a compile time warning in such a case. This will cause the writer to at least think about... should this be used in the IR hash? |
It's only required for options that don't affect the IR as generated, but arguably being forced to say that explicitly would be nice. You could do this with a .def file but that might feel heavyweight. |
@jrose-apple exactly. My fear is that someone will just add the option and not think about it. But if one at least has to explicitly do something, that can not happen. |
@vedantk Good idea to check. |
@gottesmm @jrose-apple I don't think the problem here is that someone forgot to update the function when they added an option; I think it's that someone didn't think about this edge case of an empty file, or didn't know about LLVM's behavior in this situation. The way to prevent similar bugs would be to add a compiler mode which computes the module hash, then if it would normally skip compiling the IR, proceeds anyway and fails if the resulting assembly or |
We could do that in +Asserts builds. We're only losing incrementality by doing so. |
When rebuilding a .o file, we hash the IR generated by IRGen and compare it to a hash embedded in the .o file. If they match, we simply skip asking LLVM to optimize and generate code. Certain flags that control our LLVM configuration are mixed into the hash, but the -profile-generate flag was not one of them. This usually wouldn’t matter because profiling would insert additional code into the IR, but in edge cases like empty files or files containing only protocol declarations, it could cause linker errors when profiling was turned off. This change adds the `GenerateProfile` flag into the IR hash, ensuring that we always recompile, even if the IR was identical. Fixes rdar://problem/54126622.
We do a particular dance to diagnose errors twice, and we’re about to do it three times.
ef3868f
to
7c62629
Compare
This is probably the most complicated and expensive "assertion" I've ever written, but it's in and it works. I've also started to mix |
@swift-ci please test |
Build failed |
Build failed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work. cc @eeckstein
lib/IRGen/IRGen.cpp
Outdated
std::string message; | ||
if (0 != llvm::DiffFilesWithTolerance(OutputFilename, | ||
OriginalOutputFilename, | ||
/*tolerances=*/0, 0, &message)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like overkill. Check out swift::moveFileIfDifferent
.
This separates the “do these two files have the same contents?” logic from the “move or delete” logic in `moveFileIfDifferent()`, creating a useful helper function. It also ties the special-case behavior for the `destination` parameter to a flag, since we have a use where we won’t want that.
In assert builds, when performLLVM() would normally skip invoking LLVM because it believes it would generate the same code, it now generates code into a temporary file and compares it to the output. This should catch mistakes where Swift configures LLVM in ways which affect the output, but which the incremental logic doesn’t account for.
7c62629
to
7c043b8
Compare
include/swift/AST/IRGenOptions.h
Outdated
@@ -259,6 +259,8 @@ class IRGenOptions { | |||
unsigned Hash = (unsigned)OptMode; | |||
Hash = (Hash << 1) | DisableLLVMOptzns; | |||
Hash = (Hash << 1) | DisableSwiftSpecificLLVMOptzns; | |||
Hash = (Hash << 1) | GenerateProfile; | |||
Hash = (Hash << 1) | Sanitizers.contains(SanitizerKind::Fuzzer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly we might as well mix in the whole thing. It's not just about empty files; it's also about *non-*empty files that only change due to LLVM passes, and I'm not sure I trust that all the other sanitizers don't need this.
test/IRGen/Inputs/empty.swift
Outdated
@@ -0,0 +1 @@ | |||
// This space intentionally left blank. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: There's already a test/Inputs/empty.swift you can use.
Thanks for the refactor! |
@swift-ci please smoke test |
Replaces getLLVMCodeGenOptionsHash(), which combined a bunch of individual bits into a string, with writeLLVMCodeGenOptionsTo(), which writes each one separately into a raw_ostream. This is intended to be make the hashing more future-proof. The current design needs to know exactly how many bits each of the values needs; if any of the values grew and you forgot to update this function, its bits would interfere with those of an earlier value in the hash. This new design is expected to be slightly slower, but more robust to future change in the compiler.
Mixes the state of all sanitizers, not just the fuzzer, into the module hash used to decide whether to skip LLVM codegen. I don’t actually know of a case where one of the other sanitizers will generate identical IR and different machine code, but being defensive costs us very little.
@jrose-apple I mixed in all of the sanitizers, and also redesigned the hash slightly to avoid any issues when we add more sanitizers/optimization modes/values to these fields in general. |
@swift-ci please smoke test |
When rebuilding a .o file, we hash the IR generated by IRGen and compare it to a hash embedded in the .o file. If they match, we simply skip asking LLVM to optimize and generate code. Certain flags that control our LLVM configuration are mixed into the hash, but the -profile-generate flag was not one of them. This usually wouldn’t matter because profiling would insert additional code into the IR, but in edge cases like empty files or files containing only protocol declarations, it could cause linker errors when profiling was turned off.
This change adds the
GenerateProfile
flag into the IR hash, ensuring that we always recompile, even if the IR was identical.Fixes rdar://problem/54126622.