Skip to content

Commit 527fcb8

Browse files
spaitsGabor Spaitssteakhal
authored
[analyzer] Add std::variant checker (#66481)
As my BSc thesis I've implemented a checker for std::variant and std::any, and in the following weeks I'll upload a revised version of them here. # Prelude @Szelethus and I sent out an email with our initial plans here: https://discourse.llvm.org/t/analyzer-new-checker-for-std-any-as-a-bsc-thesis/65613/2 We also created a stub checker patch here: https://reviews.llvm.org/D142354. Upon the recommendation of @haoNoQ , we explored an option where instead of writing a checker, we tried to improve on how the analyzer natively inlined the methods of std::variant and std::any. Our attempt is in this patch https://reviews.llvm.org/D145069, but in a nutshell, this is what happened: The analyzer was able to model much of what happened inside those classes, but our false positive suppression machinery erroneously suppressed it. After months of trying, we could not find a satisfying enhancement on the heuristic without introducing an allowlist/denylist of which functions to not suppress. As a result (and partly on the encouragement of @Xazax-hun) I wrote a dedicated checker! The advantage of the checker is that it is not dependent on the standard's implementation and won't put warnings in the standard library definitions. Also without the checker it would be difficult to create nice user-friendly warnings and NoteTags -- as per the standard's specification, the analysis is sinked by an exception, which we don't model well now. # Design ideas The working of the checker is straightforward: We find the creation of an std::variant instance, store the type of the variable we want to store in it, then save this type for the instance. When retrieving type from the instance we check what type we want to retrieve as, and compare it to the actual type. If the two don't march we emit an error. Distinguishing variants by instance (e.g. MemRegion *) is not the most optimal way. Other checkers, like MallocChecker uses a symbol-to-trait map instead of region-to-trait. The upside of using symbols (which would be the value of a variant, not the variant itself itself) is that the analyzer would take care of modeling copies, moves, invalidation, etc, out of the box. The problem is that for compound types, the analyzer doesn't create a symbol as a result of a constructor call that is fit for this job. MallocChecker in contrast manipulates simple pointers. My colleges and I considered the option of making adjustments directly to the memory model of the analyzer, but for the time being decided against it, and go with the bit more cumbersome, but immediately viable option of simply using MemRegions. # Current state and review plan This patch contains an already working checker that can find and report certain variant/any misuses, but still lands it in alpha. I plan to upload the rest of the checker in later patches. The full checker is also able to "follow" the symbolic value held by the std::variant and updates the program state whenever we assign the value stored in the variant. I have also built a library that is meant to model union-like types similar to variant, hence some functions being a bit more multipurpose then is immediately needed. I also intend to publish my std::any checker in a later commit. --------- Co-authored-by: Gabor Spaits <[email protected]> Co-authored-by: Balazs Benics <[email protected]>
1 parent 8336bfb commit 527fcb8

File tree

9 files changed

+953
-50
lines changed

9 files changed

+953
-50
lines changed

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -318,6 +318,10 @@ def C11LockChecker : Checker<"C11Lock">,
318318
Dependencies<[PthreadLockBase]>,
319319
Documentation<HasDocumentation>;
320320

321+
def StdVariantChecker : Checker<"StdVariant">,
322+
HelpText<"Check for bad type access for std::variant.">,
323+
Documentation<HasDocumentation>;
324+
321325
} // end "alpha.core"
322326

323327
//===----------------------------------------------------------------------===//

clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h

Lines changed: 50 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -78,7 +78,7 @@ enum CallEventKind {
7878

7979
class CallEvent;
8080

81-
template<typename T = CallEvent>
81+
template <typename T = CallEvent>
8282
class CallEventRef : public IntrusiveRefCntPtr<const T> {
8383
public:
8484
CallEventRef(const T *Call) : IntrusiveRefCntPtr<const T>(Call) {}
@@ -94,8 +94,7 @@ class CallEventRef : public IntrusiveRefCntPtr<const T> {
9494

9595
// Allow implicit conversions to a superclass type, since CallEventRef
9696
// behaves like a pointer-to-const.
97-
template <typename SuperT>
98-
operator CallEventRef<SuperT> () const {
97+
template <typename SuperT> operator CallEventRef<SuperT>() const {
9998
return this->get();
10099
}
101100
};
@@ -124,9 +123,9 @@ class RuntimeDefinition {
124123

125124
public:
126125
RuntimeDefinition() = default;
127-
RuntimeDefinition(const Decl *InD): D(InD) {}
126+
RuntimeDefinition(const Decl *InD) : D(InD) {}
128127
RuntimeDefinition(const Decl *InD, bool Foreign) : D(InD), Foreign(Foreign) {}
129-
RuntimeDefinition(const Decl *InD, const MemRegion *InR): D(InD), R(InR) {}
128+
RuntimeDefinition(const Decl *InD, const MemRegion *InR) : D(InD), R(InR) {}
130129

131130
const Decl *getDecl() { return D; }
132131
bool isForeign() const { return Foreign; }
@@ -207,8 +206,9 @@ class CallEvent {
207206

208207
/// Used to specify non-argument regions that will be invalidated as a
209208
/// result of this call.
210-
virtual void getExtraInvalidatedValues(ValueList &Values,
211-
RegionAndSymbolInvalidationTraits *ETraits) const {}
209+
virtual void
210+
getExtraInvalidatedValues(ValueList &Values,
211+
RegionAndSymbolInvalidationTraits *ETraits) const {}
212212

213213
public:
214214
CallEvent &operator=(const CallEvent &) = delete;
@@ -231,14 +231,10 @@ class CallEvent {
231231
void setForeign(bool B) const { Foreign = B; }
232232

233233
/// The state in which the call is being evaluated.
234-
const ProgramStateRef &getState() const {
235-
return State;
236-
}
234+
const ProgramStateRef &getState() const { return State; }
237235

238236
/// The context in which the call is being evaluated.
239-
const LocationContext *getLocationContext() const {
240-
return LCtx;
241-
}
237+
const LocationContext *getLocationContext() const { return LCtx; }
242238

243239
const CFGBlock::ConstCFGElementRef &getCFGElementRef() const {
244240
return ElemRef;
@@ -270,7 +266,7 @@ class CallEvent {
270266
SourceLocation Loc = D->getLocation();
271267
if (Loc.isValid()) {
272268
const SourceManager &SM =
273-
getState()->getStateManager().getContext().getSourceManager();
269+
getState()->getStateManager().getContext().getSourceManager();
274270
return SM.isInSystemHeader(D->getLocation());
275271
}
276272

@@ -324,9 +320,7 @@ class CallEvent {
324320
// NOTE: The exact semantics of this are still being defined!
325321
// We don't really want a list of hardcoded exceptions in the long run,
326322
// but we don't want duplicated lists of known APIs in the short term either.
327-
virtual bool argumentsMayEscape() const {
328-
return hasNonZeroCallbackArg();
329-
}
323+
virtual bool argumentsMayEscape() const { return hasNonZeroCallbackArg(); }
330324

331325
/// Returns true if the callee is an externally-visible function in the
332326
/// top-level namespace, such as \c malloc.
@@ -456,6 +450,14 @@ class CallEvent {
456450
/// can be retrieved from its construction context.
457451
std::optional<SVal> getReturnValueUnderConstruction() const;
458452

453+
// Returns the CallEvent representing the caller of this function
454+
const CallEventRef<> getCaller() const;
455+
456+
// Returns true if the function was called from a standard library function.
457+
// If not or could not get the caller (it may be a top level function)
458+
// returns false.
459+
bool isCalledFromSystemHeader() const;
460+
459461
// Iterator access to formal parameters and their types.
460462
private:
461463
struct GetTypeFn {
@@ -579,8 +581,9 @@ class BlockCall : public CallEvent {
579581

580582
void cloneTo(void *Dest) const override { new (Dest) BlockCall(*this); }
581583

582-
void getExtraInvalidatedValues(ValueList &Values,
583-
RegionAndSymbolInvalidationTraits *ETraits) const override;
584+
void getExtraInvalidatedValues(
585+
ValueList &Values,
586+
RegionAndSymbolInvalidationTraits *ETraits) const override;
584587

585588
public:
586589
const CallExpr *getOriginExpr() const override {
@@ -650,14 +653,12 @@ class BlockCall : public CallEvent {
650653
// the block body and analyze the operator() method on the captured lambda.
651654
const VarDecl *LambdaVD = getRegionStoringCapturedLambda()->getDecl();
652655
const CXXRecordDecl *LambdaDecl = LambdaVD->getType()->getAsCXXRecordDecl();
653-
CXXMethodDecl* LambdaCallOperator = LambdaDecl->getLambdaCallOperator();
656+
CXXMethodDecl *LambdaCallOperator = LambdaDecl->getLambdaCallOperator();
654657

655658
return RuntimeDefinition(LambdaCallOperator);
656659
}
657660

658-
bool argumentsMayEscape() const override {
659-
return true;
660-
}
661+
bool argumentsMayEscape() const override { return true; }
661662

662663
void getInitialStackFrameContents(const StackFrameContext *CalleeCtx,
663664
BindingsTy &Bindings) const override;
@@ -684,8 +685,9 @@ class CXXInstanceCall : public AnyFunctionCall {
684685
: AnyFunctionCall(D, St, LCtx, ElemRef) {}
685686
CXXInstanceCall(const CXXInstanceCall &Other) = default;
686687

687-
void getExtraInvalidatedValues(ValueList &Values,
688-
RegionAndSymbolInvalidationTraits *ETraits) const override;
688+
void getExtraInvalidatedValues(
689+
ValueList &Values,
690+
RegionAndSymbolInvalidationTraits *ETraits) const override;
689691

690692
public:
691693
/// Returns the expression representing the implicit 'this' object.
@@ -843,7 +845,9 @@ class CXXDestructorCall : public CXXInstanceCall {
843845

844846
CXXDestructorCall(const CXXDestructorCall &Other) = default;
845847

846-
void cloneTo(void *Dest) const override {new (Dest) CXXDestructorCall(*this);}
848+
void cloneTo(void *Dest) const override {
849+
new (Dest) CXXDestructorCall(*this);
850+
}
847851

848852
public:
849853
SourceRange getSourceRange() const override { return Location; }
@@ -880,8 +884,9 @@ class AnyCXXConstructorCall : public AnyFunctionCall {
880884
Data = Target;
881885
}
882886

883-
void getExtraInvalidatedValues(ValueList &Values,
884-
RegionAndSymbolInvalidationTraits *ETraits) const override;
887+
void getExtraInvalidatedValues(
888+
ValueList &Values,
889+
RegionAndSymbolInvalidationTraits *ETraits) const override;
885890

886891
void getInitialStackFrameContents(const StackFrameContext *CalleeCtx,
887892
BindingsTy &Bindings) const override;
@@ -921,7 +926,9 @@ class CXXConstructorCall : public AnyCXXConstructorCall {
921926

922927
CXXConstructorCall(const CXXConstructorCall &Other) = default;
923928

924-
void cloneTo(void *Dest) const override { new (Dest) CXXConstructorCall(*this); }
929+
void cloneTo(void *Dest) const override {
930+
new (Dest) CXXConstructorCall(*this);
931+
}
925932

926933
public:
927934
const CXXConstructExpr *getOriginExpr() const override {
@@ -1040,7 +1047,9 @@ class CXXAllocatorCall : public AnyFunctionCall {
10401047
: AnyFunctionCall(E, St, LCtx, ElemRef) {}
10411048
CXXAllocatorCall(const CXXAllocatorCall &Other) = default;
10421049

1043-
void cloneTo(void *Dest) const override { new (Dest) CXXAllocatorCall(*this); }
1050+
void cloneTo(void *Dest) const override {
1051+
new (Dest) CXXAllocatorCall(*this);
1052+
}
10441053

10451054
public:
10461055
const CXXNewExpr *getOriginExpr() const override {
@@ -1154,11 +1163,7 @@ class CXXDeallocatorCall : public AnyFunctionCall {
11541163
//
11551164
// Note to maintainers: OCM_Message should always be last, since it does not
11561165
// need to fit in the Data field's low bits.
1157-
enum ObjCMessageKind {
1158-
OCM_PropertyAccess,
1159-
OCM_Subscript,
1160-
OCM_Message
1161-
};
1166+
enum ObjCMessageKind { OCM_PropertyAccess, OCM_Subscript, OCM_Message };
11621167

11631168
/// Represents any expression that calls an Objective-C method.
11641169
///
@@ -1180,8 +1185,9 @@ class ObjCMethodCall : public CallEvent {
11801185

11811186
void cloneTo(void *Dest) const override { new (Dest) ObjCMethodCall(*this); }
11821187

1183-
void getExtraInvalidatedValues(ValueList &Values,
1184-
RegionAndSymbolInvalidationTraits *ETraits) const override;
1188+
void getExtraInvalidatedValues(
1189+
ValueList &Values,
1190+
RegionAndSymbolInvalidationTraits *ETraits) const override;
11851191

11861192
/// Check if the selector may have multiple definitions (may have overrides).
11871193
virtual bool canBeOverridenInSubclass(ObjCInterfaceDecl *IDecl,
@@ -1196,9 +1202,7 @@ class ObjCMethodCall : public CallEvent {
11961202
return getOriginExpr()->getMethodDecl();
11971203
}
11981204

1199-
unsigned getNumArgs() const override {
1200-
return getOriginExpr()->getNumArgs();
1201-
}
1205+
unsigned getNumArgs() const override { return getOriginExpr()->getNumArgs(); }
12021206

12031207
const Expr *getArgExpr(unsigned Index) const override {
12041208
return getOriginExpr()->getArg(Index);
@@ -1212,9 +1216,7 @@ class ObjCMethodCall : public CallEvent {
12121216
return getOriginExpr()->getMethodFamily();
12131217
}
12141218

1215-
Selector getSelector() const {
1216-
return getOriginExpr()->getSelector();
1217-
}
1219+
Selector getSelector() const { return getOriginExpr()->getSelector(); }
12181220

12191221
SourceRange getSourceRange() const override;
12201222

@@ -1262,7 +1264,7 @@ class ObjCMethodCall : public CallEvent {
12621264
void getInitialStackFrameContents(const StackFrameContext *CalleeCtx,
12631265
BindingsTy &Bindings) const override;
12641266

1265-
ArrayRef<ParmVarDecl*> parameters() const override;
1267+
ArrayRef<ParmVarDecl *> parameters() const override;
12661268

12671269
Kind getKind() const override { return CE_ObjCMessage; }
12681270
StringRef getKindAsString() const override { return "ObjCMethodCall"; }
@@ -1336,8 +1338,8 @@ class CallEventManager {
13361338
CallEventManager(llvm::BumpPtrAllocator &alloc) : Alloc(alloc) {}
13371339

13381340
/// Gets an outside caller given a callee context.
1339-
CallEventRef<>
1340-
getCaller(const StackFrameContext *CalleeCtx, ProgramStateRef State);
1341+
CallEventRef<> getCaller(const StackFrameContext *CalleeCtx,
1342+
ProgramStateRef State);
13411343

13421344
/// Gets a call event for a function call, Objective-C method call,
13431345
/// a 'new', or a 'delete' call.
@@ -1433,11 +1435,10 @@ inline void CallEvent::Release() const {
14331435
namespace llvm {
14341436

14351437
// Support isa<>, cast<>, and dyn_cast<> for CallEventRef.
1436-
template<class T> struct simplify_type< clang::ento::CallEventRef<T>> {
1438+
template <class T> struct simplify_type<clang::ento::CallEventRef<T>> {
14371439
using SimpleType = const T *;
14381440

1439-
static SimpleType
1440-
getSimplifiedValue(clang::ento::CallEventRef<T> Val) {
1441+
static SimpleType getSimplifiedValue(clang::ento::CallEventRef<T> Val) {
14411442
return Val.get();
14421443
}
14431444
};

clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -108,6 +108,7 @@ add_clang_library(clangStaticAnalyzerCheckers
108108
SmartPtrModeling.cpp
109109
StackAddrEscapeChecker.cpp
110110
StdLibraryFunctionsChecker.cpp
111+
StdVariantChecker.cpp
111112
STLAlgorithmModeling.cpp
112113
StreamChecker.cpp
113114
StringChecker.cpp

0 commit comments

Comments
 (0)