[IRGen] Mix -profile-generate into module hash #26774

beccadax · 2019-08-22T01:11:21Z

When rebuilding a .o file, we hash the IR generated by IRGen and compare it to a hash embedded in the .o file. If they match, we simply skip asking LLVM to optimize and generate code. Certain flags that control our LLVM configuration are mixed into the hash, but the -profile-generate flag was not one of them. This usually wouldn’t matter because profiling would insert additional code into the IR, but in edge cases like empty files or files containing only protocol declarations, it could cause linker errors when profiling was turned off.

This change mixes the GenerateProfile flag into the IR hash, ensuring that we always recompile when the flag changes, even if the IR is identical.
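A minimal sketch of the skip decision described above, for readers who want to see the edge case concretely. All names here (`CodeGenOptions`, `moduleHash`, `canSkipLLVM`) are hypothetical stand-ins rather than the actual IRGen API, and FNV-1a is used only to keep the example dependency-free; it is not the hash the compiler uses.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the options that configure LLVM.
struct CodeGenOptions {
  unsigned optMode = 0;
  bool generateProfile = false;
};

// FNV-1a, used purely for illustration.
std::uint64_t fnv1a(const std::string &bytes) {
  std::uint64_t hash = 14695981039346656037ULL;
  for (unsigned char c : bytes) {
    hash ^= c;
    hash *= 1099511628211ULL;
  }
  return hash;
}

// Hashes the IR text together with the options. mixProfileFlag selects
// between the behavior before this change (false) and after it (true).
std::uint64_t moduleHash(const std::string &ir, const CodeGenOptions &opts,
                         bool mixProfileFlag) {
  std::string input = ir;
  input.push_back('\0');
  input += std::to_string(opts.optMode);
  if (mixProfileFlag) {
    input.push_back('\0');
    input.push_back(opts.generateProfile ? '1' : '0');
  }
  return fnv1a(input);
}

// Codegen is skipped when the hash stored in the .o file matches the new one.
bool canSkipLLVM(std::uint64_t storedHash, std::uint64_t newHash) {
  return storedHash == newHash;
}

// The edge case: an empty file yields identical IR with and without profiling.
void checkEmptyFileCase() {
  CodeGenOptions plain, profiled;
  profiled.generateProfile = true;
  // Flag not mixed in: the stale object file is wrongly reused.
  assert(canSkipLLVM(moduleHash("", plain, false),
                     moduleHash("", profiled, false)));
  // Flag mixed in: the hashes differ, so LLVM runs again.
  assert(!canSkipLLVM(moduleHash("", plain, true),
                      moduleHash("", profiled, true)));
}
```

Calling `checkEmptyFileCase()` exercises both configurations; the second assertion is the behavior this change is after.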

Fixes rdar://problem/54126622.

beccadax · 2019-08-22T01:11:37Z

@swift-ci please smoke test

gottesmm · 2019-08-22T01:55:20Z

This is for @eeckstein

vedantk · 2019-08-22T02:19:42Z

Thanks for working on this. To be on the safe side, should other types of instrumentation be included here as well (say, {a,t}san)?

eeckstein

lgtm!

gottesmm · 2019-08-22T15:42:02Z

Question. Would it be possible to change this into some sort of exhaustive switch for Options? That way if someone adds a new option like this, they get a warning telling them that they may need to update this code.

gottesmm · 2019-08-22T15:43:27Z

In other words, I am worried that the code as written will cause this same bug to occur again. If we had some sort of switch associated with it, at least we could get a compile time warning in such a case. This will cause the writer to at least think about... should this be used in the IR hash?

jrose-apple · 2019-08-22T16:52:09Z

It's only required for options that don't affect the IR as generated, but arguably being forced to say that explicitly would be nice. You could do this with a .def file but that might feel heavyweight.
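One way the .def-style idea could look, sketched with an inline X-macro so the example stays self-contained. This is an assumption about the shape of such a table, not code from the PR: `HashPolicy`, `CODEGEN_OPTIONS`, `hashInput`, and the `EmbedComments` option are all invented for illustration. The point is that every entry has to state a policy, so adding an option without one fails to compile.

```cpp
#include <cassert>
#include <string>

enum HashPolicy { MixIntoHash, SkipInHash };

// Hypothetical option table. In a real tree this could live in a .def file.
// EmbedComments is an invented example of an option with no effect on codegen.
#define CODEGEN_OPTIONS(OPTION)                \
  OPTION(unsigned, OptMode, MixIntoHash)       \
  OPTION(bool, DisableLLVMOptzns, MixIntoHash) \
  OPTION(bool, GenerateProfile, MixIntoHash)   \
  OPTION(bool, EmbedComments, SkipInHash)

// Declare one field per table entry.
struct Options {
#define DECLARE_FIELD(Type, Name, Policy) Type Name{};
  CODEGEN_OPTIONS(DECLARE_FIELD)
#undef DECLARE_FIELD
};

// Serialize only the fields whose policy says they belong in the hash.
std::string hashInput(const Options &opts) {
  std::string out;
#define MIX_FIELD(Type, Name, Policy)                                  \
  if (Policy == MixIntoHash) {                                         \
    out += #Name "=";                                                  \
    out += std::to_string(static_cast<unsigned long long>(opts.Name)); \
    out += ';';                                                        \
  }
  CODEGEN_OPTIONS(MIX_FIELD)
#undef MIX_FIELD
  return out;
}

void checkHashInput() {
  Options opts;
  opts.GenerateProfile = true;
  opts.EmbedComments = true; // skipped by policy, so it must not appear
  assert(hashInput(opts) == "OptMode=0;DisableLLVMOptzns=0;GenerateProfile=1;");
}
```

An entry written with only two arguments would be rejected by the preprocessor, which is the "forced to say it explicitly" property. It does not, of course, guarantee that the author picks the right policy.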

gottesmm · 2019-08-22T17:05:13Z

@jrose-apple exactly. My fear is that someone will just add the option and not think about it. But if one at least has to explicitly do something, that cannot happen.

beccadax · 2019-08-22T17:25:59Z

@vedantk Good idea to check. -sanitize=address and -sanitize=thread emit different IR for an empty source file. -sanitize=undefined emits the same IR, but also the same assembly on x86_64 at least. -sanitize=fuzzer emits the same IR but different assembly, so that one should be mixed in too.

beccadax · 2019-08-22T17:32:53Z

@gottesmm @jrose-apple I don't think the problem here is that someone forgot to update the function when they added an option; I think it's that someone didn't think about this edge case of an empty file, or didn't know about LLVM's behavior in this situation. -profile-generate doesn't usually need to be mixed into the flags because it will normally cause IR changes; only in files with no executable code does it behave incorrectly. If there had been a mechanism to make the programmer think about whether this flag needed to be mixed into the IR hash, they would have thought about it and chosen "No".

The way to prevent similar bugs would be to add a compiler mode which computes the module hash, then if it would normally skip compiling the IR, proceeds anyway and fails if the resulting assembly or .o file differs from the cached version. Then we can encourage folks who add a flag to add tests running the compiler in that mode against some battery of input files.

jrose-apple · 2019-08-22T17:45:23Z

We could do that in +Asserts builds. We're only losing incrementality by doing so.

beccadax · 2019-08-22T23:07:43Z

This is probably the most complicated and expensive "assertion" I've ever written, but it's in and it works. I've also started to mix -sanitize=fuzzer into the hash.

beccadax · 2019-08-22T23:22:17Z

@swift-ci please test

swift-ci · 2019-08-22T23:22:48Z

Build failed
Swift Test OS X Platform
Git Sha - ef3868fba3b7c8bf548ea413a3e33cc5e297f9fb

swift-ci · 2019-08-22T23:24:33Z

Build failed
Swift Test Linux Platform
Git Sha - ef3868fba3b7c8bf548ea413a3e33cc5e297f9fb

jrose-apple

Nice work. cc @eeckstein

jrose-apple · 2019-08-22T23:24:46Z

lib/IRGen/IRGen.cpp

+    std::string message;
+    if (0 != llvm::DiffFilesWithTolerance(OutputFilename,
+                                          OriginalOutputFilename,
+                                          /*tolerances=*/0, 0, &message)) {


This seems like overkill. Check out swift::moveFileIfDifferent.

This separates the “do these two files have the same contents?” logic from the “move or delete” logic in `moveFileIfDifferent()`, creating a useful helper function. It also ties the special-case behavior for the `destination` parameter to a flag, since we have a use where we won’t want that.
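A rough sketch of what a standalone "same contents?" helper can look like once it is split from the move-or-delete step. The names `streamsDiffer` and `fileContentsDiffer` are hypothetical and the signatures are assumptions; this is not the helper extracted in the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <fstream>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Compares two streams byte by byte. The four-iterator std::equal (C++14)
// also reports a difference when one stream is a prefix of the other.
bool streamsDiffer(std::istream &a, std::istream &b) {
  std::istreambuf_iterator<char> aBegin(a), bBegin(b), end;
  return !std::equal(aBegin, end, bBegin, end);
}

// File-based wrapper. Treating an unreadable file as "different" is a choice
// made for this sketch only.
bool fileContentsDiffer(const std::string &pathA, const std::string &pathB) {
  std::ifstream a(pathA, std::ios::binary), b(pathB, std::ios::binary);
  if (!a || !b)
    return true;
  return streamsDiffer(a, b);
}

void checkStreamsDiffer() {
  std::istringstream same1("abc"), same2("abc");
  assert(!streamsDiffer(same1, same2));
  std::istringstream longer("abcd"), shorter("abc");
  assert(streamsDiffer(longer, shorter));
}
```

Keeping the comparison free of side effects is what lets a caller reuse it for verification without moving or deleting anything.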

In assert builds, when performLLVM() would normally skip invoking LLVM because it believes it would generate the same code, it now generates code into a temporary file and compares it to the output. This should catch mistakes where Swift configures LLVM in ways which affect the output, but which the incremental logic doesn’t account for.
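The control flow described above, reduced to an in-memory sketch. Everything here is hypothetical: `CodeGenResult`, `performCodeGen`, and the `runBackend` callback stand in for the real performLLVM() machinery, and a string stands in for the object file and the temporary file.

```cpp
#include <cassert>
#include <functional>
#include <string>

enum class CodeGenResult { Regenerated, Skipped, VerifiedSkip, StaleSkipDetected };

// hashMatches: the stored module hash equals the freshly computed one.
// verifySkips: models an assert build, where a skip is double-checked.
// cachedOutput: stands in for the existing .o file.
CodeGenResult performCodeGen(bool hashMatches, bool verifySkips,
                             const std::function<std::string()> &runBackend,
                             std::string &cachedOutput) {
  if (!hashMatches) {
    cachedOutput = runBackend(); // normal path: regenerate the output
    return CodeGenResult::Regenerated;
  }
  if (!verifySkips)
    return CodeGenResult::Skipped; // trust the hash and do no work
  // Verification path: generate into a temporary and compare with the cache.
  std::string fresh = runBackend();
  return fresh == cachedOutput ? CodeGenResult::VerifiedSkip
                               : CodeGenResult::StaleSkipDetected;
}

void checkVerifiedSkip() {
  std::string cached = "old code";
  auto backend = [] { return std::string("new code"); };
  // Without verification, the stale cache goes unnoticed.
  assert(performCodeGen(true, false, backend, cached) == CodeGenResult::Skipped);
  // With verification, the mismatch is caught.
  assert(performCodeGen(true, true, backend, cached) ==
         CodeGenResult::StaleSkipDetected);
}
```

As noted earlier in the thread, the cost of the verification path is only lost incrementality, since the backend runs even when the hash matches.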

jrose-apple · 2019-08-23T15:30:23Z

include/swift/AST/IRGenOptions.h

@@ -259,6 +259,8 @@ class IRGenOptions {
    unsigned Hash = (unsigned)OptMode;
    Hash = (Hash << 1) | DisableLLVMOptzns;
    Hash = (Hash << 1) | DisableSwiftSpecificLLVMOptzns;
+    Hash = (Hash << 1) | GenerateProfile;
+    Hash = (Hash << 1) | Sanitizers.contains(SanitizerKind::Fuzzer);


Honestly we might as well mix in the whole thing. It's not just about empty files; it's also about *non-*empty files that only change due to LLVM passes, and I'm not sure I trust that all the other sanitizers don't need this.

jrose-apple · 2019-08-23T15:32:56Z

test/IRGen/Inputs/empty.swift

@@ -0,0 +1 @@
+// This space intentionally left blank.


Nitpick: There's already a test/Inputs/empty.swift you can use.

jrose-apple · 2019-08-23T15:33:21Z

Thanks for the refactor!

beccadax · 2019-08-23T17:37:33Z

@swift-ci please smoke test

Replaces getLLVMCodeGenOptionsHash(), which combined a bunch of individual bits into a string, with writeLLVMCodeGenOptionsTo(), which writes each one separately into a raw_ostream. This is intended to make the hashing more future-proof. The current design needs to know exactly how many bits each of the values needs; if any of the values grew and you forgot to update this function, its bits would interfere with those of an earlier value in the hash. This new design is expected to be slightly slower, but more robust to future changes in the compiler.
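To make the interference concrete, here is a small sketch under stated assumptions. `packedHash` imitates the shift-and-or style with a field that has outgrown its single bit, and `serializedOptions` writes each value at a fixed width into a `std::string`, which stands in for the raw_ostream. Both functions and the `sanitizerMask` parameter are hypothetical, not the code in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bit-packed scheme. The one-bit shift silently assumes that sanitizerMask
// fits in a single bit, which stops being true once it can hold 2 or 3.
unsigned packedHash(unsigned optMode, unsigned sanitizerMask,
                    bool generateProfile) {
  unsigned hash = optMode;
  hash = (hash << 1) | sanitizerMask;
  hash = (hash << 1) | generateProfile;
  return hash;
}

// Appends one value as eight little-endian bytes.
void writeField(std::string &out, std::uint64_t value) {
  for (int byte = 0; byte < 8; ++byte)
    out.push_back(static_cast<char>((value >> (8 * byte)) & 0xff));
}

// Stream-style scheme: every field gets its own fixed-width slot, so a field
// that grows cannot spill into its neighbors.
std::string serializedOptions(unsigned optMode, unsigned sanitizerMask,
                              bool generateProfile) {
  std::string out;
  writeField(out, optMode);
  writeField(out, sanitizerMask);
  writeField(out, generateProfile);
  return out;
}

void checkFieldInterference() {
  // Two different configurations collapse to the same packed value.
  assert(packedHash(1, 0, false) == packedHash(0, 2, false));
  // The fixed-width encoding keeps them apart.
  assert(serializedOptions(1, 0, false) != serializedOptions(0, 2, false));
}
```

The fixed-width encoding produces more bytes to hash, which is consistent with the "slightly slower, but more robust" trade-off described in the commit message.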

Mixes the state of all sanitizers, not just the fuzzer, into the module hash used to decide whether to skip LLVM codegen. I don’t actually know of a case where one of the other sanitizers will generate identical IR and different machine code, but being defensive costs us very little.

beccadax · 2019-08-23T22:30:17Z

@jrose-apple I mixed in all of the sanitizers, and also redesigned the hash slightly to avoid any issues when we add more sanitizers/optimization modes/values to these fields in general.

beccadax · 2019-08-23T22:31:20Z

@swift-ci please smoke test

beccadax requested a review from gottesmm August 22, 2019 01:11

beccadax requested a review from vedantk August 22, 2019 01:26

eeckstein approved these changes Aug 22, 2019

beccadax added 2 commits August 22, 2019 16:02

[NFC] Extract helper from performLLVM()

5cd505a

We do a particular dance to diagnose errors twice, and we’re about to do it three times.

beccadax force-pushed the raise-the-profile branch from ef3868f to 7c62629 Compare August 22, 2019 23:03

beccadax requested a review from jrose-apple August 22, 2019 23:08

jrose-apple reviewed Aug 22, 2019

beccadax added 3 commits August 22, 2019 19:37

Mix fuzzer flag into IRGen module hash

7c043b8

beccadax force-pushed the raise-the-profile branch from 7c62629 to 7c043b8 Compare August 23, 2019 03:04

jrose-apple reviewed Aug 23, 2019

beccadax added 3 commits August 23, 2019 15:17

[NFC] Remove redundant empty test case file

8fa1ee1

beccadax merged commit 3fe9333 into swiftlang:master Aug 24, 2019
