Skip to content

[analyzer] Set and display CSA analysis entry points as notes on debugging #84823

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 25, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion clang/include/clang/Analysis/PathDiagnostic.h
Original file line number Diff line number Diff line change
Expand Up @@ -780,6 +780,9 @@ class PathDiagnostic : public llvm::FoldingSetNode {
PathDiagnosticLocation UniqueingLoc;
const Decl *UniqueingDecl;

/// The top-level entry point from which this issue was discovered.
const Decl *AnalysisEntryPoint = nullptr;

/// Lines executed in the path.
std::unique_ptr<FilesToLineNumsMap> ExecutedLines;

Expand All @@ -788,7 +791,7 @@ class PathDiagnostic : public llvm::FoldingSetNode {
PathDiagnostic(StringRef CheckerName, const Decl *DeclWithIssue,
StringRef bugtype, StringRef verboseDesc, StringRef shortDesc,
StringRef category, PathDiagnosticLocation LocationToUnique,
const Decl *DeclToUnique,
const Decl *DeclToUnique, const Decl *AnalysisEntryPoint,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a tiny bit worried that decl-with-issue and uniquing-location/decl are already too much to ask from the client of this interface (cf. #79398 and #80371), maybe we shouldn't be adding more stuff to this interface. It looks like you have a nice solution that doesn't force each checker to fill it in manually. But if we, say, plug clang-tidy into this interface (a long-term dream of mine that never materialized unfortunately, and that's entirely on me) then they'll have a lot of trouble filling this in.

Given that you seem to be mostly interested in path-sensitive reports, could it be easier to simply extract the decl simply pull the decl from the first path piece? (Yes, we have a Decl in every path piece.) (I'm pretty sure the first piece is always in the entry-point function, because at the very least you'll have a "Calling..." event before you inline a call.) (Or just take any path piece and find the top-level location context.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also had similar concerns as you, for similar reasons.

I'm indeed only interested in CSA issues. I also want to support the check::ASTCodeBody callback, from checkers can only emit BasicBugReports, and not path sensitive ones - but notice that its fairly easy to get the decl-with-issue for them, it's D.
Due to this, and for the fact that we have some downstream legacy checkers that does something like this and only emits BasicBugReports, I opted for taking this in the ctor.

Do you have specific suggestions or scenarios to consider?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, this is a debug feature. Debug features are great but it's probably ok to have them slightly incorrect or incomplete if it means that they're no longer associated with increased complexity, with paying for something we don't use.

Out of those AST checkers, how many are emitting diagnostics outside of the entry function / decl-with-issue function? If you append

[debug] This bubble's decl name is 'foo()'

to every message bubble, would that entirely cover all your needs?

I can easily see how there could be a checker that always warns outside of the entry function. If that's the case then it's probably incredibly useful for debugging to quickly know which entry function inspired the warning. But at the same time I'm not sure I can think of a good example when the same information wouldn't be useful for the user as well; maybe the users could benefit from a user-facing note too? (Somewhat similarly to how Clang explains template instantiation stack when diagnosing problems in individual template instantiations. (Something we cannot mimic in static analysis tools because the instantiation stack is already lost.))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you have a nice solution that doesn't force each checker to fill it in manually.

TBH I'm pretty discontent with my implementation. This was basically the third attempt and the so far the cleanest - but it's still bad. The whole diagnostic pipeline is just a mess.

I mean, this is a debug feature. Debug features are great but it's probably ok to have them slightly incorrect or incomplete if it means that they're no longer associated with increased complexity, with paying for something we don't use.

Do you imply I should relax the PathDiagnostic ctor precondition to allow nullptr entry point decls?
From the name Path I judged, the entry point must be always present.

Out of those AST checkers, how many are emitting diagnostics outside of the entry function / decl-with-issue function? If you append

[debug] This bubble's decl name is 'foo()'

to every message bubble, would that entirely cover all your needs?

I'm not sure what "bubble" refers to, but if it's a synonym for "note" or "message", then yes - plus:
The entry point decl would be also beneficial for a primitive caching mechanism to know what reports we need to invalidate if a given decl changes across subsequent static analyzer runs on different git commits. I'll present the idea at EuroLLVM next week, but the gist of it is basically that we associate each diagnostic with a decl, hash that decl; and if subsequent runs hash that decl to the same value, then we can replay the diagnostic and skipping the entry point. To achieve this, we of course need to know the entry point.

I can easily see how there could be a checker that always warns outside of the entry function. If that's the case then it's probably incredibly useful for debugging to quickly know which entry function inspired the warning. But at the same time I'm not sure I can think of a good example when the same information wouldn't be useful for the user as well; maybe the users could benefit from a user-facing note too? (Somewhat similarly to how Clang explains template instantiation stack when diagnosing problems in individual template instantiations. (Something we cannot mimic in static analysis tools because the instantiation stack is already lost.))

I'm natural for having these notes (or a slightly differently phrased notes) denoting the start of the analysis.
If you think it would be genuinely useful, I could add it.

std::unique_ptr<FilesToLineNumsMap> ExecutedLines);
~PathDiagnostic();

Expand Down Expand Up @@ -852,6 +855,9 @@ class PathDiagnostic : public llvm::FoldingSetNode {
return *ExecutedLines;
}

/// Get the top-level entry point from which this issue was discovered.
const Decl *getAnalysisEntryPoint() const { return AnalysisEntryPoint; }

/// Return the semantic context where an issue occurred. If the
/// issue occurs along a path, this represents the "central" area
/// where the bug manifests.
Expand Down
3 changes: 3 additions & 0 deletions clang/include/clang/Driver/Options.td
Original file line number Diff line number Diff line change
Expand Up @@ -6688,6 +6688,9 @@ def analyzer_opt_analyze_headers : Flag<["-"], "analyzer-opt-analyze-headers">,
def analyzer_display_progress : Flag<["-"], "analyzer-display-progress">,
HelpText<"Emit verbose output about the analyzer's progress">,
MarshallingInfoFlag<AnalyzerOpts<"AnalyzerDisplayProgress">>;
def analyzer_note_analysis_entry_points : Flag<["-"], "analyzer-note-analysis-entry-points">,
HelpText<"Add a note for each bug report to denote their analysis entry points">,
MarshallingInfoFlag<AnalyzerOpts<"AnalyzerNoteAnalysisEntryPoints">>;
def analyze_function : Separate<["-"], "analyze-function">,
HelpText<"Run analysis on specific function (for C++ include parameters in name)">,
MarshallingInfoString<AnalyzerOpts<"AnalyzeSpecificFunction">>;
Expand Down
9 changes: 5 additions & 4 deletions clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
Original file line number Diff line number Diff line change
Expand Up @@ -227,6 +227,7 @@ class AnalyzerOptions : public RefCountedBase<AnalyzerOptions> {
unsigned ShouldEmitErrorsOnInvalidConfigValue : 1;
unsigned AnalyzeAll : 1;
unsigned AnalyzerDisplayProgress : 1;
unsigned AnalyzerNoteAnalysisEntryPoints : 1;

unsigned eagerlyAssumeBinOpBifurcation : 1;

Expand Down Expand Up @@ -291,10 +292,10 @@ class AnalyzerOptions : public RefCountedBase<AnalyzerOptions> {
ShowCheckerOptionDeveloperList(false), ShowEnabledCheckerList(false),
ShowConfigOptionsList(false),
ShouldEmitErrorsOnInvalidConfigValue(false), AnalyzeAll(false),
AnalyzerDisplayProgress(false), eagerlyAssumeBinOpBifurcation(false),
TrimGraph(false), visualizeExplodedGraphWithGraphViz(false),
UnoptimizedCFG(false), PrintStats(false), NoRetryExhausted(false),
AnalyzerWerror(false) {}
AnalyzerDisplayProgress(false), AnalyzerNoteAnalysisEntryPoints(false),
eagerlyAssumeBinOpBifurcation(false), TrimGraph(false),
visualizeExplodedGraphWithGraphViz(false), UnoptimizedCFG(false),
PrintStats(false), NoRetryExhausted(false), AnalyzerWerror(false) {}

/// Interprets an option's string value as a boolean. The "true" string is
/// interpreted as true and the "false" string is interpreted as false.
Expand Down
12 changes: 12 additions & 0 deletions clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h
Original file line number Diff line number Diff line change
Expand Up @@ -586,6 +586,9 @@ class BugReporter {
private:
BugReporterData& D;

/// The top-level entry point for the issue to be reported.
const Decl *AnalysisEntryPoint = nullptr;

/// Generate and flush the diagnostics for the given bug report.
void FlushReport(BugReportEquivClass& EQ);

Expand Down Expand Up @@ -623,6 +626,14 @@ class BugReporter {

Preprocessor &getPreprocessor() { return D.getPreprocessor(); }

/// Get the top-level entry point for the issue to be reported.
const Decl *getAnalysisEntryPoint() const { return AnalysisEntryPoint; }

void setAnalysisEntryPoint(const Decl *EntryPoint) {
assert(EntryPoint);
AnalysisEntryPoint = EntryPoint;
}

/// Add the given report to the set of reports tracked by BugReporter.
///
/// The reports are usually generated by the checkers. Further, they are
Expand Down Expand Up @@ -713,6 +724,7 @@ class BugReporterContext {
virtual ~BugReporterContext() = default;

PathSensitiveBugReporter& getBugReporter() { return BR; }
const PathSensitiveBugReporter &getBugReporter() const { return BR; }

ProgramStateManager& getStateManager() const {
return BR.getStateManager();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -187,6 +187,8 @@ class ExprEngine {

/// Returns true if there is still simulation state on the worklist.
bool ExecuteWorkList(const LocationContext *L, unsigned Steps = 150000) {
assert(L->inTopFrame());
BR.setAnalysisEntryPoint(L->getDecl());
return Engine.ExecuteWorkList(L, Steps, nullptr);
}

Expand Down
7 changes: 5 additions & 2 deletions clang/lib/Analysis/PathDiagnostic.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -115,14 +115,17 @@ PathDiagnostic::PathDiagnostic(
StringRef CheckerName, const Decl *declWithIssue, StringRef bugtype,
StringRef verboseDesc, StringRef shortDesc, StringRef category,
PathDiagnosticLocation LocationToUnique, const Decl *DeclToUnique,
const Decl *AnalysisEntryPoint,
std::unique_ptr<FilesToLineNumsMap> ExecutedLines)
: CheckerName(CheckerName), DeclWithIssue(declWithIssue),
BugType(StripTrailingDots(bugtype)),
VerboseDesc(StripTrailingDots(verboseDesc)),
ShortDesc(StripTrailingDots(shortDesc)),
Category(StripTrailingDots(category)), UniqueingLoc(LocationToUnique),
UniqueingDecl(DeclToUnique), ExecutedLines(std::move(ExecutedLines)),
path(pathImpl) {}
UniqueingDecl(DeclToUnique), AnalysisEntryPoint(AnalysisEntryPoint),
ExecutedLines(std::move(ExecutedLines)), path(pathImpl) {
assert(AnalysisEntryPoint);
}

void PathDiagnosticConsumer::anchor() {}

Expand Down
36 changes: 26 additions & 10 deletions clang/lib/StaticAnalyzer/Core/BugReporter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -138,7 +138,8 @@ class PathDiagnosticConstruct {
public:
PathDiagnosticConstruct(const PathDiagnosticConsumer *PDC,
const ExplodedNode *ErrorNode,
const PathSensitiveBugReport *R);
const PathSensitiveBugReport *R,
const Decl *AnalysisEntryPoint);

/// \returns the location context associated with the current position in the
/// bug path.
Expand Down Expand Up @@ -1323,24 +1324,26 @@ void PathDiagnosticBuilder::generatePathDiagnosticsForNode(
}

static std::unique_ptr<PathDiagnostic>
generateDiagnosticForBasicReport(const BasicBugReport *R) {
generateDiagnosticForBasicReport(const BasicBugReport *R,
const Decl *AnalysisEntryPoint) {
const BugType &BT = R->getBugType();
return std::make_unique<PathDiagnostic>(
BT.getCheckerName(), R->getDeclWithIssue(), BT.getDescription(),
R->getDescription(), R->getShortDescription(/*UseFallback=*/false),
BT.getCategory(), R->getUniqueingLocation(), R->getUniqueingDecl(),
std::make_unique<FilesToLineNumsMap>());
AnalysisEntryPoint, std::make_unique<FilesToLineNumsMap>());
}

static std::unique_ptr<PathDiagnostic>
generateEmptyDiagnosticForReport(const PathSensitiveBugReport *R,
const SourceManager &SM) {
const SourceManager &SM,
const Decl *AnalysisEntryPoint) {
const BugType &BT = R->getBugType();
return std::make_unique<PathDiagnostic>(
BT.getCheckerName(), R->getDeclWithIssue(), BT.getDescription(),
R->getDescription(), R->getShortDescription(/*UseFallback=*/false),
BT.getCategory(), R->getUniqueingLocation(), R->getUniqueingDecl(),
findExecutedLines(SM, R->getErrorNode()));
AnalysisEntryPoint, findExecutedLines(SM, R->getErrorNode()));
}

static const Stmt *getStmtParent(const Stmt *S, const ParentMap &PM) {
Expand Down Expand Up @@ -1976,10 +1979,11 @@ static void updateExecutedLinesWithDiagnosticPieces(PathDiagnostic &PD) {

PathDiagnosticConstruct::PathDiagnosticConstruct(
const PathDiagnosticConsumer *PDC, const ExplodedNode *ErrorNode,
const PathSensitiveBugReport *R)
const PathSensitiveBugReport *R, const Decl *AnalysisEntryPoint)
: Consumer(PDC), CurrentNode(ErrorNode),
SM(CurrentNode->getCodeDecl().getASTContext().getSourceManager()),
PD(generateEmptyDiagnosticForReport(R, getSourceManager())) {
PD(generateEmptyDiagnosticForReport(R, getSourceManager(),
AnalysisEntryPoint)) {
LCM[&PD->getActivePath()] = ErrorNode->getLocationContext();
}

Expand All @@ -1993,13 +1997,14 @@ PathDiagnosticBuilder::PathDiagnosticBuilder(

std::unique_ptr<PathDiagnostic>
PathDiagnosticBuilder::generate(const PathDiagnosticConsumer *PDC) const {
PathDiagnosticConstruct Construct(PDC, ErrorNode, R);
const Decl *EntryPoint = getBugReporter().getAnalysisEntryPoint();
PathDiagnosticConstruct Construct(PDC, ErrorNode, R, EntryPoint);

const SourceManager &SM = getSourceManager();
const AnalyzerOptions &Opts = getAnalyzerOptions();

if (!PDC->shouldGenerateDiagnostics())
return generateEmptyDiagnosticForReport(R, getSourceManager());
return generateEmptyDiagnosticForReport(R, getSourceManager(), EntryPoint);

// Construct the final (warning) event for the bug report.
auto EndNotes = VisitorsDiagnostics->find(ErrorNode);
Expand Down Expand Up @@ -3123,6 +3128,16 @@ void BugReporter::FlushReport(BugReportEquivClass& EQ) {
Pieces.back()->addFixit(I);

updateExecutedLinesWithDiagnosticPieces(*PD);

// If we are debugging, let's have the entry point as the first note.
if (getAnalyzerOptions().AnalyzerDisplayProgress ||
getAnalyzerOptions().AnalyzerNoteAnalysisEntryPoints) {
const Decl *EntryPoint = getAnalysisEntryPoint();
Pieces.push_front(std::make_shared<PathDiagnosticEventPiece>(
PathDiagnosticLocation{EntryPoint->getLocation(), getSourceManager()},
"[debug] analyzing from " +
AnalysisDeclContext::getFunctionName(EntryPoint)));
}
Consumer->HandlePathDiagnostic(std::move(PD));
}
}
Expand Down Expand Up @@ -3211,7 +3226,8 @@ BugReporter::generateDiagnosticForConsumerMap(
auto *basicReport = cast<BasicBugReport>(exampleReport);
auto Out = std::make_unique<DiagnosticForConsumerMapTy>();
for (auto *Consumer : consumers)
(*Out)[Consumer] = generateDiagnosticForBasicReport(basicReport);
(*Out)[Consumer] =
generateDiagnosticForBasicReport(basicReport, AnalysisEntryPoint);
return Out;
}

Expand Down
4 changes: 3 additions & 1 deletion clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -527,7 +527,8 @@ static void reportAnalyzerFunctionMisuse(const AnalyzerOptions &Opts,

void AnalysisConsumer::runAnalysisOnTranslationUnit(ASTContext &C) {
BugReporter BR(*Mgr);
TranslationUnitDecl *TU = C.getTranslationUnitDecl();
const TranslationUnitDecl *TU = C.getTranslationUnitDecl();
BR.setAnalysisEntryPoint(TU);
if (SyntaxCheckTimer)
SyntaxCheckTimer->startTimer();
checkerMgr->runCheckersOnASTDecl(TU, *Mgr, BR);
Expand Down Expand Up @@ -675,6 +676,7 @@ void AnalysisConsumer::HandleCode(Decl *D, AnalysisMode Mode,

DisplayFunction(D, Mode, IMode);
BugReporter BR(*Mgr);
BR.setAnalysisEntryPoint(D);

if (Mode & AM_Syntax) {
llvm::TimeRecord CheckerStartTime;
Expand Down
42 changes: 33 additions & 9 deletions clang/test/Analysis/analyzer-display-progress.cpp
Original file line number Diff line number Diff line change
@@ -1,22 +1,46 @@
// RUN: %clang_analyze_cc1 -analyzer-display-progress %s 2>&1 | FileCheck %s
// RUN: %clang_analyze_cc1 -verify %s 2>&1 \
// RUN: -analyzer-display-progress \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-output=text \
// RUN: | FileCheck %s

void f() {};
void g() {};
void h() {}
void clang_analyzer_warnIfReached();

// expected-note@+2 {{[debug] analyzing from f()}}
// expected-warning@+1 {{REACHABLE}} expected-note@+1 {{REACHABLE}}
void f() { clang_analyzer_warnIfReached(); }

// expected-note@+2 {{[debug] analyzing from g()}}
// expected-warning@+1 {{REACHABLE}} expected-note@+1 {{REACHABLE}}
void g() { clang_analyzer_warnIfReached(); }

// expected-note@+2 {{[debug] analyzing from h()}}
// expected-warning@+1 {{REACHABLE}} expected-note@+1 {{REACHABLE}}
void h() { clang_analyzer_warnIfReached(); }

struct SomeStruct {
void f() {}
// expected-note@+2 {{[debug] analyzing from SomeStruct::f()}}
// expected-warning@+1 {{REACHABLE}} expected-note@+1 {{REACHABLE}}
void f() { clang_analyzer_warnIfReached(); }
};

struct SomeOtherStruct {
void f() {}
// expected-note@+2 {{[debug] analyzing from SomeOtherStruct::f()}}
// expected-warning@+1 {{REACHABLE}} expected-note@+1 {{REACHABLE}}
void f() { clang_analyzer_warnIfReached(); }
};

namespace ns {
struct SomeStruct {
void f(int) {}
void f(float, ::SomeStruct) {}
void f(float, SomeStruct) {}
// expected-note@+2 {{[debug] analyzing from ns::SomeStruct::f(int)}}
// expected-warning@+1 {{REACHABLE}} expected-note@+1 {{REACHABLE}}
void f(int) { clang_analyzer_warnIfReached(); }
// expected-note@+2 {{[debug] analyzing from ns::SomeStruct::f(float, ::SomeStruct)}}
// expected-warning@+1 {{REACHABLE}} expected-note@+1 {{REACHABLE}}
void f(float, ::SomeStruct) { clang_analyzer_warnIfReached(); }
// expected-note@+2 {{[debug] analyzing from ns::SomeStruct::f(float, SomeStruct)}}
// expected-warning@+1 {{REACHABLE}} expected-note@+1 {{REACHABLE}}
void f(float, SomeStruct) { clang_analyzer_warnIfReached(); }
};
}

Expand Down
31 changes: 22 additions & 9 deletions clang/test/Analysis/analyzer-display-progress.m
Original file line number Diff line number Diff line change
@@ -1,30 +1,43 @@
// RUN: %clang_analyze_cc1 -fblocks -analyzer-display-progress %s 2>&1 | FileCheck %s
// RUN: %clang_analyze_cc1 -fblocks -verify %s 2>&1 \
// RUN: -analyzer-display-progress \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-output=text \
// RUN: | FileCheck %s

#include "Inputs/system-header-simulator-objc.h"

static void f(void) {}
void clang_analyzer_warnIfReached();

// expected-note@+2 {{[debug] analyzing from f}}
// expected-warning@+1 {{REACHABLE}} expected-note@+1 {{REACHABLE}}
static void f(void) { clang_analyzer_warnIfReached(); }

@interface I: NSObject
-(void)instanceMethod:(int)arg1 with:(int)arg2;
+(void)classMethod;
@end

@implementation I
-(void)instanceMethod:(int)arg1 with:(int)arg2 {}
+(void)classMethod {}
// expected-warning@+1 {{REACHABLE}} expected-note@+1 {{REACHABLE}}
-(void)instanceMethod:(int)arg1 with:(int)arg2 { clang_analyzer_warnIfReached(); }

// expected-warning@+1 {{REACHABLE}} expected-note@+1 {{REACHABLE}}
+(void)classMethod { clang_analyzer_warnIfReached(); }
@end

// expected-note@+1 3 {{[debug] analyzing from g}}
void g(I *i, int x, int y) {
[I classMethod];
[i instanceMethod: x with: y];
[I classMethod]; // expected-note {{Calling 'classMethod'}}
[i instanceMethod: x with: y]; // expected-note {{Calling 'instanceMethod:with:'}}

void (^block)(void);
block = ^{};
block();
// expected-warning@+1 {{REACHABLE}} expected-note@+1 {{REACHABLE}}
block = ^{ clang_analyzer_warnIfReached(); };
block(); // expected-note {{Calling anonymous block}}
}

// CHECK: analyzer-display-progress.m f
// CHECK: analyzer-display-progress.m -[I instanceMethod:with:]
// CHECK: analyzer-display-progress.m +[I classMethod]
// CHECK: analyzer-display-progress.m g
// CHECK: analyzer-display-progress.m block (line: 22, col: 11)
// CHECK: analyzer-display-progress.m block (line: 35, col: 11)
Loading