-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[analyzer] Retry UNDEF Z3 queries 2 times by default #120239
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-static-analyzer-1 Author: Balazs Benics (steakhal) ChangesIf we have a refutation Z3 query timed out (UNDEF), allow a couple of retries to improve stability of the query. By default allow 2 retries. Retries should help mitigating flaky Z3 queries. Note that with each retry, we spend more time per query. Currently, we have a 15 seconds timeout per query - which are also in effect for the retry attempts. CPP-5920 Full diff: https://github.com/llvm/llvm-project/pull/120239.diff 7 Files Affected:
diff --git a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
index d8a7c755c95968..46ff91f488fb5c 100644
--- a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
+++ b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
@@ -193,6 +193,8 @@ ANALYZER_OPTION(
"with \"crosscheck-with-z3-timeout-threshold\" of 300 ms, would nicely "
"guarantee that no bug report equivalence class can take longer than "
"1 second, effectively mitigating Z3 hangs during refutation. "
+ "If there were Z3 retries, only the minimum query time is considered "
+ "when accumulating query times within a report equivalence class. "
"Set 0 for no timeout.", 0)
ANALYZER_OPTION(
@@ -213,6 +215,15 @@ ANALYZER_OPTION(
"400'000 should on average make Z3 queries run for up to 100ms on modern "
"hardware. Set 0 for unlimited.", 0)
+ANALYZER_OPTION(
+ unsigned, Z3CrosscheckRetriesOnTimeout,
+ "crosscheck-with-z3-retries-on-timeout",
+ "Set how many times the oracle is allowed to retry a Z3 query. "
+ "Set 0 for not allowing retries, in which case each Z3 query would be "
+ "attempted only once. Increasing the number of retries is often more "
+ "effective at reducing the number of nondeterministic diagnostics than "
+ "\"crosscheck-with-z3-timeout-threshold\" in practice.", 2)
+
ANALYZER_OPTION(bool, ShouldReportIssuesInMainSourceFile,
"report-in-main-source-file",
"Whether or not the diagnostic report should be always "
diff --git a/clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp b/clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp
index 739db951b3e185..7f4d7a93ee1b11 100644
--- a/clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp
+++ b/clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp
@@ -21,6 +21,9 @@
#define DEBUG_TYPE "Z3CrosscheckOracle"
+// Queries retried at most `Z3CrosscheckRetriesOnTimeout` number of times if the
+// `check()` call returns `UNDEF` for any reason. Each query is only counted
+// once for these statistics, the retries are not accounted for.
STATISTIC(NumZ3QueriesDone, "Number of Z3 queries done");
STATISTIC(NumTimesZ3TimedOut, "Number of times Z3 query timed out");
STATISTIC(NumTimesZ3ExhaustedRLimit,
@@ -77,16 +80,33 @@ void Z3CrosscheckVisitor::finalizeVisitor(BugReporterContext &BRC,
RefutationSolver->addConstraint(SMTConstraints);
}
- // And check for satisfiability
- llvm::TimeRecord Start = llvm::TimeRecord::getCurrentTime(/*Start=*/true);
- std::optional<bool> IsSAT = RefutationSolver->check();
- llvm::TimeRecord Diff = llvm::TimeRecord::getCurrentTime(/*Start=*/false);
- Diff -= Start;
- Result = Z3Result{
- IsSAT,
- static_cast<unsigned>(Diff.getWallTime() * 1000),
- RefutationSolver->getStatistics()->getUnsigned("rlimit count"),
+ auto GetUsedRLimit = [](const llvm::SMTSolverRef &Solver) {
+ return Solver->getStatistics()->getUnsigned("rlimit count");
+ };
+
+ auto AttemptOnce = [&](const llvm::SMTSolverRef &Solver) -> Z3Result {
+ constexpr auto getCurrentTime = llvm::TimeRecord::getCurrentTime;
+ unsigned InitialRLimit = GetUsedRLimit(Solver);
+ double Start = getCurrentTime(/*Start=*/true).getWallTime();
+ std::optional<bool> IsSAT = Solver->check();
+ double End = getCurrentTime(/*Start=*/false).getWallTime();
+ return {
+ IsSAT,
+ static_cast<unsigned>((End - Start) * 1000),
+ GetUsedRLimit(Solver) - InitialRLimit,
+ };
};
+
+ // And check for satisfiability
+ unsigned MinQueryTimeAcrossAttempts = std::numeric_limits<unsigned>::max();
+ unsigned NumRetries = Opts.Z3CrosscheckRetriesOnTimeout;
+ for (unsigned Attempt = 1; Attempt <= 1 + NumRetries; ++Attempt) {
+ Result = AttemptOnce(RefutationSolver);
+ Result.Z3QueryTimeMilliseconds =
+ std::min(MinQueryTimeAcrossAttempts, Result.Z3QueryTimeMilliseconds);
+ if (Result.IsSAT.has_value())
+ return;
+ }
}
void Z3CrosscheckVisitor::addConstraints(
diff --git a/clang/test/Analysis/analyzer-config.c b/clang/test/Analysis/analyzer-config.c
index 8ce6184144d4b8..af4527c9fb2823 100644
--- a/clang/test/Analysis/analyzer-config.c
+++ b/clang/test/Analysis/analyzer-config.c
@@ -41,6 +41,7 @@
// CHECK-NEXT: cplusplus.SmartPtrModeling:ModelSmartPtrDereference = false
// CHECK-NEXT: crosscheck-with-z3 = false
// CHECK-NEXT: crosscheck-with-z3-eqclass-timeout-threshold = 0
+// CHECK-NEXT: crosscheck-with-z3-retries-on-timeout = 2
// CHECK-NEXT: crosscheck-with-z3-rlimit-threshold = 0
// CHECK-NEXT: crosscheck-with-z3-timeout-threshold = 15000
// CHECK-NEXT: ctu-dir = ""
diff --git a/clang/test/Analysis/z3-crosscheck-retry.cpp b/clang/test/Analysis/z3-crosscheck-retry.cpp
new file mode 100644
index 00000000000000..b81d3469b3dd26
--- /dev/null
+++ b/clang/test/Analysis/z3-crosscheck-retry.cpp
@@ -0,0 +1,39 @@
+// Check the default config.
+// RUN: %clang_analyze_cc1 -analyzer-checker=debug.ConfigDumper 2>&1 \
+// RUN: | FileCheck %s --match-full-lines
+// CHECK: crosscheck-with-z3-retries-on-timeout = 2
+
+// RUN: rm -rf %t && mkdir %t
+// RUN: %host_cxx -shared -fPIC \
+// RUN: %S/z3/Inputs/MockZ3_solver_check.cpp \
+// RUN: -o %t/MockZ3_solver_check.so
+
+// DEFINE: %{mocked_clang} = \
+// DEFINE: LD_PRELOAD="%t/MockZ3_solver_check.so" \
+// DEFINE: %clang_cc1 %s -analyze -setup-static-analyzer \
+// DEFINE: -analyzer-config crosscheck-with-z3=true \
+// DEFINE: -analyzer-checker=core
+
+// DEFINE: %{retry} = -analyzer-config crosscheck-with-z3-retries-on-timeout
+
+// RUN: Z3_SOLVER_RESULTS="UNDEF" %{mocked_clang} %{retry}=0 -verify=refuted
+// RUN: Z3_SOLVER_RESULTS="UNSAT" %{mocked_clang} %{retry}=0 -verify=refuted
+// RUN: Z3_SOLVER_RESULTS="SAT" %{mocked_clang} %{retry}=0 -verify=accepted
+
+// RUN: Z3_SOLVER_RESULTS="UNDEF,UNDEF" %{mocked_clang} %{retry}=1 -verify=refuted
+// RUN: Z3_SOLVER_RESULTS="UNDEF,UNSAT" %{mocked_clang} %{retry}=1 -verify=refuted
+// RUN: Z3_SOLVER_RESULTS="UNDEF,SAT" %{mocked_clang} %{retry}=1 -verify=accepted
+
+// RUN: Z3_SOLVER_RESULTS="UNDEF,UNDEF,UNDEF" %{mocked_clang} %{retry}=2 -verify=refuted
+// RUN: Z3_SOLVER_RESULTS="UNDEF,UNDEF,UNSAT" %{mocked_clang} %{retry}=2 -verify=refuted
+// RUN: Z3_SOLVER_RESULTS="UNDEF,UNDEF,SAT" %{mocked_clang} %{retry}=2 -verify=accepted
+
+
+// REQUIRES: z3, asserts, shell, system-linux
+
+// refuted-no-diagnostics
+
+int div_by_zero_test(int b) {
+ if (b) {}
+ return 100 / b; // accepted-warning {{Division by zero}}
+}
diff --git a/clang/test/Analysis/z3/D83660.c b/clang/test/Analysis/z3/D83660.c
index fd463333a51e38..b42d934bcc9e7b 100644
--- a/clang/test/Analysis/z3/D83660.c
+++ b/clang/test/Analysis/z3/D83660.c
@@ -1,21 +1,18 @@
// RUN: rm -rf %t && mkdir %t
-// RUN: %host_cxx -shared -fPIC %S/Inputs/MockZ3_solver_check.c -o %t/MockZ3_solver_check.so
+// RUN: %host_cxx -shared -fPIC \
+// RUN: %S/Inputs/MockZ3_solver_check.cpp \
+// RUN: -o %t/MockZ3_solver_check.so
//
-// RUN: LD_PRELOAD="%t/MockZ3_solver_check.so" \
-// RUN: %clang_cc1 -analyze -analyzer-constraints=z3 -setup-static-analyzer \
-// RUN: -analyzer-checker=core,debug.ExprInspection %s -verify 2>&1 | FileCheck %s
+// RUN: Z3_SOLVER_RESULTS="SAT,SAT,SAT,SAT,UNDEF" \
+// RUN: LD_PRELOAD="%t/MockZ3_solver_check.so" \
+// RUN: %clang_cc1 -analyze -analyzer-constraints=z3 -setup-static-analyzer \
+// RUN: -analyzer-checker=core %s -verify
//
// REQUIRES: z3, asserts, shell, system-linux
//
// Works only with the z3 constraint manager.
// expected-no-diagnostics
-// CHECK: Z3_solver_check returns the real value: TRUE
-// CHECK-NEXT: Z3_solver_check returns the real value: TRUE
-// CHECK-NEXT: Z3_solver_check returns the real value: TRUE
-// CHECK-NEXT: Z3_solver_check returns the real value: TRUE
-// CHECK-NEXT: Z3_solver_check returns a mocked value: UNDEF
-
void D83660(int b) {
if (b) {
}
diff --git a/clang/test/Analysis/z3/Inputs/MockZ3_solver_check.c b/clang/test/Analysis/z3/Inputs/MockZ3_solver_check.c
deleted file mode 100644
index 9c63a64ada2201..00000000000000
--- a/clang/test/Analysis/z3/Inputs/MockZ3_solver_check.c
+++ /dev/null
@@ -1,28 +0,0 @@
-#include <dlfcn.h>
-#include <stdio.h>
-
-#include <z3.h>
-
-// Mock implementation: return UNDEF for the 5th invocation, otherwise it just
-// returns the result of the real invocation.
-Z3_lbool Z3_API Z3_solver_check(Z3_context c, Z3_solver s) {
- static Z3_lbool (*OriginalFN)(Z3_context, Z3_solver);
- if (!OriginalFN) {
- OriginalFN = (Z3_lbool(*)(Z3_context, Z3_solver))dlsym(RTLD_NEXT,
- "Z3_solver_check");
- }
-
- // Invoke the actual solver.
- Z3_lbool Result = OriginalFN(c, s);
-
- // Mock the 5th invocation to return UNDEF.
- static unsigned int Counter = 0;
- if (++Counter == 5) {
- fprintf(stderr, "Z3_solver_check returns a mocked value: UNDEF\n");
- return Z3_L_UNDEF;
- }
- fprintf(stderr, "Z3_solver_check returns the real value: %s\n",
- (Result == Z3_L_UNDEF ? "UNDEF"
- : ((Result == Z3_L_TRUE ? "TRUE" : "FALSE"))));
- return Result;
-}
diff --git a/clang/test/Analysis/z3/Inputs/MockZ3_solver_check.cpp b/clang/test/Analysis/z3/Inputs/MockZ3_solver_check.cpp
new file mode 100644
index 00000000000000..54cd86d0ac32c0
--- /dev/null
+++ b/clang/test/Analysis/z3/Inputs/MockZ3_solver_check.cpp
@@ -0,0 +1,62 @@
+#include <cassert>
+#include <dlfcn.h>
+#include <cstdio>
+#include <cstdlib>
+#include <cstring>
+
+#include <z3.h>
+
+static char *Z3ResultsBegin;
+static char *Z3ResultsCursor;
+
+static __attribute__((constructor)) void init() {
+ const char *Env = getenv("Z3_SOLVER_RESULTS");
+ if (!Env) {
+ fprintf(stderr, "Z3_SOLVER_RESULTS envvar must be defined; abort\n");
+ abort();
+ }
+ Z3ResultsBegin = strdup(Env);
+ Z3ResultsCursor = Z3ResultsBegin;
+ if (!Z3ResultsBegin) {
+ fprintf(stderr, "strdup failed; abort\n");
+ abort();
+ }
+}
+
+static __attribute__((destructor)) void finit() {
+ if (strlen(Z3ResultsCursor) > 0) {
+ fprintf(stderr, "Z3_SOLVER_RESULTS should have been completely consumed "
+ "by the end of the test; abort\n");
+ abort();
+ }
+ free(Z3ResultsBegin);
+}
+
+static bool consume_token(char **pointer_to_cursor, const char *token) {
+ assert(pointer_to_cursor);
+ int len = strlen(token);
+ if (*pointer_to_cursor && strncmp(*pointer_to_cursor, token, len) == 0) {
+ *pointer_to_cursor += len;
+ return true;
+ }
+ return false;
+}
+
+Z3_lbool Z3_API Z3_solver_check(Z3_context c, Z3_solver s) {
+ consume_token(&Z3ResultsCursor, ",");
+
+ if (consume_token(&Z3ResultsCursor, "UNDEF")) {
+ printf("Z3_solver_check returns UNDEF\n");
+ return Z3_L_UNDEF;
+ }
+ if (consume_token(&Z3ResultsCursor, "SAT")) {
+ printf("Z3_solver_check returns SAT\n");
+ return Z3_L_TRUE;
+ }
+ if (consume_token(&Z3ResultsCursor, "UNSAT")) {
+ printf("Z3_solver_check returns UNSAT\n");
+ return Z3_L_FALSE;
+ }
+ fprintf(stderr, "Z3_SOLVER_RESULTS was exhausted; abort\n");
+ abort();
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit surprised by the idea of using multiple attempts instead of a single run with a larger timeout -- intuitively we're wasting the already performed calculations if we are impatient and abort+restart the calculations after each short timeout (instead of allocating all the time for a single run).
I can imagine a world where the runtime of the probability distribution of Z3 queries has a very "large tail", and in that case the restarts would be useful. E.g. as a toy example consider a probability distribution where rerunning one particular query takes <200 ms with 90% chance and 2000 ms in the remaining 10% of runs -- in this world doing three runs with 300 ms timeout is much better better than doing one run with 900 ms timeout.
(For more context, see e.g. wikipedia on the memorylessness property of distributions. In a memoryless distribution the expected remaining time is independent of the time that's already spent on waiting; if the distribution is even more skewed than that, then there are points when starting a new calculation is better than continuing the already running one.)
However, I don't see any reason to assume that the distribution has a "long tail" -- in fact eyeballing the graphs from your earlier measurements I'd guess that the outlier slow queries are relatively rare.
To check this question, it would be nice to have measurements that compare the total analysis time and flakiness of e.g.
To simplify the comparison, I'd suggest doing these comparisons without specifying an aggregate timeout -- but if you can suggest a reasonable value for it, then I'm also fine with that.
I moved your question to the RFC, as it is more of a higher level question. I'll come back to the minor remarks once we settle in the RFC. |
…-on-timeout" times If we have a refutation Z3 query timed out (UNDEF), allow a couple of retries to improve stability of the query. By default allow 2 retries. Retries should help mitigating flaky Z3 queries. See the details in the following RFC: https://discourse.llvm.org/t/analyzer-rfc-retry-z3-crosscheck-queries-on-timeout/83711 Note that with each retry, we spend more time per query. Currently, we have a 15 seconds timeout per query - which are also in effect for the retry attempts. --- Why should this help? In short, retrying queries should bring stability because if a query runs long it's more likely that it did so due to some runtime anomaly than it's on the edge of succeeding. This is because most queries run quick, and the queries that run long, usually run long by a fair amount. Consequently, retries should improve the stability of the outcome of the Z3 query. In general, the retries shouldn't increase the overall analysis time because it's really rare we hit the 0.1% of the cases when we would do retries. But keep in mind that the retry attempts can add up if many retries are allowed, or the individual query timeout is large. CPP-5920
I adjusted the first commit message to highlight my answer to these questions. I hope that's clear enough.
I think we discussed this under the RFC, but I want to make sure you confirm this before I'd discard this point. @NagyDonat I think I'm ready for the next round of review. |
Ok, I'd say that this explanation is good enough to claim that "this change may be helpful" -- which is sufficient to merge this change, because it's conservative, non-disruptive and close to being an NFC change. (The only cost is that it increases code complexity, which is a pity IMO, but isn't a high priority problem...) I think it would be good to include this reasoning in the final commit message of the merged commit that will be eventually placed on the main branch (unless you're strongly opposed to this).
As I said on discourse, think that the measurements (that I know of) are too noisy:
However, I understand if you don't want to do more measurements (and want to merge this based on your good intuitions + the knowledge that it doesn't cause problems), then I'm not opposed and ready to accept this PR.
Thanks for the updates! My only remaining open question is the bikeshedding about the name of the option, where I explained my position in an inline comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really grateful that you implemented my request about defining this option as a positive value; especially considering that it ended up being more work than what I expected. (I guessed that we must already had precedent(s) for positive (or at least otherwise constrained) options; while I see that you're building support for this within this commit.)
The change looks good to me, except for an off-by-one error that I mark with an inline comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the updates!
If the updated test passes, feel free to merge it.
The uploaded tests never run in the CI due to the Thanks for the review. |
The StringRef is already transitively included from the AnalyzerOptions.h
I find this really unfortunate that I had to patch the other places where we initialize the AnalyzerOptions table. |
My first idea is a "down to earth", not so modern, but more backwards compatible approach:
Unlike your approach, this solution has a short inelegant phase in the execution where the invalid value 0 is stored in
WDYT? |
This simplifies #120239 Addresses my comment at: #120239 (comment) CPP-5920
…#121910) This simplifies #120239 Addresses my comment at: llvm/llvm-project#120239 (comment) CPP-5920
If we have a refutation Z3 query timed out (UNDEF), allow a couple of
retries to improve stability of the query. By default allow 2 retries,
which will give us in maximum of 3 solve attempts per query.
Retries should help mitigating flaky Z3 queries.
See the details in the following RFC:
https://discourse.llvm.org/t/analyzer-rfc-retry-z3-crosscheck-queries-on-timeout/83711
Note that with each attempt, we spend more time per query.
Currently, we have a 15 seconds timeout per query - which are also in
effect for the retry attempts.
Why should this help?
In short, retrying queries should bring stability because if a query runs long
it's more likely that it did so due to some runtime anomaly than it's on
the edge of succeeding. This is because most queries run quick, and the
queries that run long, usually run long by a fair amount.
Consequently, retries should improve the stability of the outcome of the Z3 query.
In general, the retries shouldn't increase the overall analysis time
because it's really rare we hit the 0.1% of the cases when we would do
retries. But keep in mind that the retry attempts can add up if many
retries are allowed, or the individual query timeout is large.
CPP-5920