-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[Sanitizer] Make sanitizer passes idempotent #99439
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang @llvm/pr-subscribers-compiler-rt-sanitizer Author: Chaitanya (skc7) ChangesWhen address sanitizer pass is run after it has already been run before, double instrumentation is seen in the resulting IR. This happens because there is no check in the pass, to verify if IR has been asan instrumented before. This PR checks if asan_module_ctor is already present in the module and if present, return early without running the pass again. Full diff: https://github.com/llvm/llvm-project/pull/99439.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp b/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
index adf77f20cb1c7..30752e23073f7 100644
--- a/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
@@ -1249,8 +1249,22 @@ AddressSanitizerPass::AddressSanitizerPass(
UseOdrIndicator(UseOdrIndicator), DestructorKind(DestructorKind),
ConstructorKind(ConstructorKind) {}
+static bool hasAsanModuleCtor(Module &M) {
+ for (Function &F : M) {
+ if (F.getName() == kAsanModuleCtorName) {
+ return true;
+ }
+ }
+ return false;
+}
+
PreservedAnalyses AddressSanitizerPass::run(Module &M,
ModuleAnalysisManager &MAM) {
+ // Return early if asan.module_ctor is already present in the module.
+ // This implies that asan pass has already run before.
+ if (hasAsanModuleCtor(M))
+ return PreservedAnalyses::all();
+
ModuleAddressSanitizer ModuleSanitizer(
M, Options.InsertVersionCheck, Options.CompileKernel, Options.Recover,
UseGlobalGC, UseOdrIndicator, DestructorKind, ConstructorKind);
diff --git a/llvm/test/Instrumentation/AddressSanitizer/asan-pass-second-run.ll b/llvm/test/Instrumentation/AddressSanitizer/asan-pass-second-run.ll
new file mode 100644
index 0000000000000..65d72a72f605b
--- /dev/null
+++ b/llvm/test/Instrumentation/AddressSanitizer/asan-pass-second-run.ll
@@ -0,0 +1,56 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals all --version 5
+; This test checks in the second run, function is not instrumented again.
+; RUN: opt < %s -passes=asan,asan -S | FileCheck %s
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; Function with sanitize_address is instrumented.
+; Function Attrs: nounwind uwtable
+;.
+; CHECK: @llvm.used = appending global [1 x ptr] [ptr @asan.module_ctor], section "llvm.metadata"
+; CHECK: @___asan_globals_registered = common hidden global i64 0
+; CHECK: @__start_asan_globals = extern_weak hidden global i64
+; CHECK: @__stop_asan_globals = extern_weak hidden global i64
+; CHECK: @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 1, ptr @asan.module_ctor, ptr @asan.module_ctor }]
+;.
+define void @instr_sa(ptr %a) sanitize_address {
+; CHECK-LABEL: define void @instr_sa(
+; CHECK-SAME: ptr [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint ptr [[A]] to i64
+; CHECK-NEXT: [[TMP1:%.*]] = lshr i64 [[TMP0]], 3
+; CHECK-NEXT: [[TMP2:%.*]] = add i64 [[TMP1]], 2147450880
+; CHECK-NEXT: [[TMP3:%.*]] = inttoptr i64 [[TMP2]] to ptr
+; CHECK-NEXT: [[TMP4:%.*]] = load i8, ptr [[TMP3]], align 1
+; CHECK-NEXT: [[TMP5:%.*]] = icmp ne i8 [[TMP4]], 0
+; CHECK-NEXT: br i1 [[TMP5]], label %[[BB6:.*]], label %[[BB12:.*]], !prof [[PROF0:![0-9]+]]
+; CHECK: [[BB6]]:
+; CHECK-NEXT: [[TMP7:%.*]] = and i64 [[TMP0]], 7
+; CHECK-NEXT: [[TMP8:%.*]] = add i64 [[TMP7]], 3
+; CHECK-NEXT: [[TMP9:%.*]] = trunc i64 [[TMP8]] to i8
+; CHECK-NEXT: [[TMP10:%.*]] = icmp sge i8 [[TMP9]], [[TMP4]]
+; CHECK-NEXT: br i1 [[TMP10]], label %[[BB11:.*]], label %[[BB12]]
+; CHECK: [[BB11]]:
+; CHECK-NEXT: call void @__asan_report_load4(i64 [[TMP0]]) #[[ATTR3:[0-9]+]]
+; CHECK-NEXT: unreachable
+; CHECK: [[BB12]]:
+; CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[A]], align 4
+; CHECK-NEXT: [[TMP2:%.*]] = add i32 [[TMP1]], 1
+; CHECK-NEXT: store i32 [[TMP2]], ptr [[A]], align 4
+; CHECK-NEXT: ret void
+;
+entry:
+ %tmp1 = load i32, ptr %a, align 4
+ %tmp2 = add i32 %tmp1, 1
+ store i32 %tmp2, ptr %a, align 4
+ ret void
+}
+;.
+; CHECK: attributes #[[ATTR0]] = { sanitize_address }
+; CHECK: attributes #[[ATTR1:[0-9]+]] = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
+; CHECK: attributes #[[ATTR2:[0-9]+]] = { nounwind }
+; CHECK: attributes #[[ATTR3]] = { nomerge }
+;.
+; CHECK: [[PROF0]] = !{!"branch_weights", i32 1, i32 1048575}
+;.
|
@@ -1249,8 +1249,22 @@ AddressSanitizerPass::AddressSanitizerPass( | |||
UseOdrIndicator(UseOdrIndicator), DestructorKind(DestructorKind), | |||
ConstructorKind(ConstructorKind) {} | |||
|
|||
static bool hasAsanModuleCtor(Module &M) { | |||
for (Function &F : M) { | |||
if (F.getName() == kAsanModuleCtorName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is M.getFunction
@@ -0,0 +1,56 @@ | |||
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals all --version 5 | |||
; This test checks in the second run, function is not instrumented again. | |||
; RUN: opt < %s -passes=asan,asan -S | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed (the patch)?
Time to time we had discussion to do exactly like that, but usually it's not hard to make sure that pass is unique in pipeline
There is also ConstructorKind::kNone when constructor is not inserted.
Inconsistency with other sanitizers is also not nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We found this issue when a HIP program is first compiled to bitcode with -fsanitize=address, then the bitcode is compiled to object file with -fsanitize=address again. This lead to calculation of shadow memory twice due to double instrumentation.
As you pointed out, ConstructorKind::None can lead to asan.module_ctor to be not present in the module. Do you suggest any other way to check if module is instrumented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vitalybuka I think it's a good idea to make all sanitizers idempotent. Your guidance on preferred approaches would be appreciated. With separate compilation, runtime compilation, etc. it seems hard to always "make sure that pass is unique in the pipeline".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have some generic approach, which will work regardless of constructors.
How about about Module::addModuleFlag("nosanitize") which will be set and checked by every sanitizer?
I am not sure if this require RFC on discourse. @nikic @efriedma-quic WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have updated PR to use "nosanitize" module flag. Could also check the return value for key "nosanitize". Currently its set to "1" at end of asan pass run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure a generic nosanitize module flag makes sense. Things will break if there are two sanitizers that aren't mutually exclusive and both try to use the flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nikic Could something like below be done so that sanitizers don't confuse with just "nosanitize" flag?
Add module flag "nosanitize" with a key that is enum SanitizerKind (with all supported sanitizers listed in it), defined in common header file for sanitizers. So, before running the pass, query the "nosanitize" flag and check that corresponding key value is present, to check if the pass has previously instrumented the IR.
M.addModuleFlag(Module::ModFlagBehavior::Override, "nosanitize", SanitizerKind::Address);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd want a separate module flag for each sanitizer.
I'm a little skeptical that you pass pipeline is really doing what you want if you end up running sanitizer passes twice; that indicates you're running a bunch of other optimization passes twice without a good reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instrumenting a shadow load practically guarantees an immediate memory fault or other catastrophic behavior. I assume any optimization passes that could cause such trouble are already idempotent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nikic We don't have such mixing-compartible instrumenting sanitizers yet, but having unique flags is LGTM
@efriedma-quic all previous proposals to add such "re-run" protection ended up with "the pipeline is wrong". But implementation is so trivial, so I don't mind just to have it implemented.
e7afa9b
to
9cff0d1
Compare
2bf29ae
to
1b0b12f
Compare
@@ -1267,6 +1272,8 @@ PreservedAnalyses AddressSanitizerPass::run(Module &M, | |||
Modified |= FunctionSanitizer.instrumentFunction(F, &TLI); | |||
} | |||
Modified |= ModuleSanitizer.instrumentModule(M); | |||
if (Modified) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably can set it right after
if (M.getModuleFlag("nosanitize_address"))
return PreservedAnalyses::all();
or we can endup with cases when some modules will be modified on first pass, other on a second. Unless we know a good reason, if the are multiple sanitizer in pipeline, only the first one is allowed to modify.
return PreservedAnalyses::all();
should be fine even if we just added the flag.
ThreadSanitizer is missing |
Have added the flag for tsan-module, But not sure about tsan, which is function pass. How do we make that idempotent? |
We merged Module and function passes for other sanitizers, and we should do for tsan as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but please git some time others to respond
LGTM |
While the implementation is trivial, I also share the concern whether sanitizers should do this. There are many other instrumentation passes. Do they all need something like this? I feel this change hasn't been well motivated. |
In the worse case we need this, I'd hope that this is target-specific. |
@MaskRay can you talk more about your concerns? This won't affect compile time, and instrumenting a shadow load will cause an immediate crash of the application at runtime. This patch seems much more robust than counting on toolchains to always arrange to avoid double instrumentation. |
From Eli: I'm a little skeptical that you pass pipeline is really doing what you want if you end up running sanitizer passes twice; that indicates you're running a bunch of other optimization passes twice without a good reason. The other passes will affect compile time. There are many more instrumentation passes than sanitizers. I am concerned this leads us to the rabbit hole to add others (or others would blindly copy the pattern here, when the use cases aren't recommended/endorsed for other platforms). |
What does this mean? |
If we wanted to prevent for real we'd implement a diagnostics. The current state looks worse than the state with the patch. Maybe this way?
|
d36d58f
to
2bcca88
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits, no strong opinion on overall approach.
@@ -19,6 +19,7 @@ | |||
#include "llvm/IR/Function.h" | |||
#include "llvm/IR/IRBuilder.h" | |||
#include "llvm/IR/Instruction.h" | |||
#include "llvm/IR/Module.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be a forward-declare.
"Redundant instrumentation detected, with module flag: " + | ||
std::string(flag); | ||
M.getContext().diagnose( | ||
DiagnosticInfoInlineAsm(diagInfo, DiagnosticSeverity::DS_Warning)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem related to inline asm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't find existing DiagInfo classes for reporting IR instrumentation. Have created DiagnosticInfoInstrumentation in latest commit.
cl::desc("Ignore redundant instrumentation"), cl::Hidden, cl::init(false)); | ||
|
||
/// Check if module has flag attached, if not add the flag. | ||
bool llvm::checkIfAlreadyInstrumented(Module &M, StringRef flag) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool llvm::checkIfAlreadyInstrumented(Module &M, StringRef flag) { | |
bool llvm::checkIfAlreadyInstrumented(Module &M, StringRef Flag) { |
@vitalybuka Updated PR with changes you have suggested. Please review. |
@vitalybuka @nikic Let me know if anything needs to be changed in this PR? Else, will merge the PR after approval. |
650bc7e
to
3c7c677
Compare
Please review. |
you should click "re-request review" button next to review. |
I will take another look tomorrow. |
} // namespace | ||
|
||
/// Check if module has flag attached, if not add the flag. | ||
bool llvm::checkIfAlreadyInstrumented(Module &M, StringRef Flag) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Early return please and DS_Error?
bool llvm::checkIfAlreadyInstrumented(Module &M, StringRef Flag) { | |
bool llvm::checkIfAlreadyInstrumented(Module &M, StringRef Flag) { | |
if (!M.getModuleFlag(Flag)) { | |
M.addModuleFlag(Module::ModFlagBehavior::Override, Flag, 1); | |
return false; | |
} | |
if (ClIgnoreRedundantInstrumentation) | |
return true; | |
std::string diagInfo = | |
"Redundant instrumentation detected, with module flag: " + | |
std::string(Flag); | |
M.getContext().diagnose(DiagnosticInfoInstrumentation( | |
diagInfo, DiagnosticSeverity::DS_Error)); | |
return true; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion. Have updated the patch in the latest commit.
Using DS_Error, causes second run of the pass to abort with error. So retained DS_Warning, so this returns true without abort and sanitizer passes return without doing second instrumentation.
I remain concerned that newly developed instrumentations may require this special property. I'm also worried that this change could mask potential pass pipeline issues arising from duplicate instrumentations. |
ClIgnoreRedundantInstrumentation is false by default. |
This is also not HIP specific. Anyway as @vitalybuka noted, the default is to complain. |
OK, I see. Instrumentation.cpp reports an error by default. This addresses my concern, but now the subject and commit message are incorrect. |
3c7c677
to
b7c0520
Compare
Sanizer passes setting a "no sanitizer" magic variable is backwards. If this behaviour is the way to go, have clang set a "needs_asan_lowering" or whatever and have the corresponding pass remove it. It shouldn't be necessary to emit ever increasing lists of pass and target specific cruft in the IR to avoid miscompilation. The opposite way round is much better - compile correctly when the flag is missing, and add the ad hoc metadata to switch on non-default behaviour |
The patch was merged without fixing the misleading subject and commit message: "[Sanitizer] Make sanitizer passes idempotent". I see that the related patch #87265 (I know nothing about it) has "Request Changes". Is it right to merge this? |
This patch was intended to make the second run(or subsequent runs) of sanitizer passes not change any IR (make them idempotent). Is there any issue with the commit message? |
FYI, this PR causes #105569. |
This PR changes the sanitizer passes to be idempotent. When any sanitizer pass is run after it has already been run before, double instrumentation is seen in the resulting IR. This happens because there is no check in the pass, to verify if IR has been instrumented before. This PR checks if "nosanitize_*" module flag is already present and if true, return early without running the pass again. Change-Id: I0aa0e8fa0d5c4b00721d2721606a4a1c10dc7d29
This PR changes the sanitizer passes to be idempotent. When any sanitizer pass is run after it has already been run before, double instrumentation is seen in the resulting IR. This happens because there is no check in the pass, to verify if IR has been instrumented before. This PR checks if "nosanitize_*" module flag is already present and if true, return early without running the pass again.
This PR changes the sanitizer passes to be idempotent.
When any sanitizer pass is run after it has already been run before, double instrumentation is seen in the resulting IR. This happens because there is no check in the pass, to verify if IR has been instrumented before.
This PR checks if "nosanitize_*" module flag is already present and if true, return early without running the pass again.