-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang] Catch missing format attributes #105479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -3899,7 +3899,7 @@ static bool handleFormatAttrCommon(Sema &S, Decl *D, const ParsedAttr &AL, | |||||
|
||||||
// In C++ the implicit 'this' function parameter also counts, and they are | ||||||
// counted from one. | ||||||
bool HasImplicitThisParam = isInstanceMethod(D); | ||||||
bool HasImplicitThisParam = checkIfMethodHasImplicitObjectParameter(D); | ||||||
Info->NumArgs = getFunctionOrMethodNumParams(D) + HasImplicitThisParam; | ||||||
|
||||||
Info->Identifier = AL.getArgAsIdent(0)->Ident; | ||||||
|
@@ -4042,7 +4042,7 @@ static void handleCallbackAttr(Sema &S, Decl *D, const ParsedAttr &AL) { | |||||
return; | ||||||
} | ||||||
|
||||||
bool HasImplicitThisParam = isInstanceMethod(D); | ||||||
bool HasImplicitThisParam = checkIfMethodHasImplicitObjectParameter(D); | ||||||
int32_t NumArgs = getFunctionOrMethodNumParams(D); | ||||||
|
||||||
FunctionDecl *FD = D->getAsFunction(); | ||||||
|
@@ -5918,6 +5918,181 @@ static void handlePreferredTypeAttr(Sema &S, Decl *D, const ParsedAttr &AL) { | |||||
D->addAttr(::new (S.Context) PreferredTypeAttr(S.Context, AL, ParmTSI)); | ||||||
} | ||||||
|
||||||
// Diagnosing missing format attributes is implemented in two steps: | ||||||
// 1. Detect missing format attributes while checking function calls. | ||||||
// 2. Emit diagnostic in part that processes function body. | ||||||
// For this purpose it is created vector that stores information about format | ||||||
// attributes. There are no two format attributes with same arguments in a | ||||||
// vector. Vector could contains attributes that only store information about | ||||||
// format type (format string and first to check argument are set to -1). | ||||||
namespace { | ||||||
std::vector<FormatAttr *> MissingAttributes; | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Clang does not use global mutable state. You need to find some other way to keep that around. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think we need to keep it around at all. The patch wants to avoid emitting the warning when the same parameter is used in different format functions, so as to avoid suggesting to add multiple incompatible attributes. But since different format functions tend to use a different language, this is probably extremely rare. We should of course try to avoid false positives, but in this case it seems rather hypothetical, and it seems to come with a high cost. If we stop worrying about that, we can directly emit the warning when checking a call. |
||||||
} // end anonymous namespace | ||||||
|
||||||
// This function is called only if function call is not inside template body. | ||||||
// TODO: Add call for function calls inside template body. | ||||||
// Detects and stores missing format attributes in a vector. | ||||||
void Sema::DetectMissingFormatAttributes(const FunctionDecl *Callee, | ||||||
ArrayRef<const Expr *> Args, | ||||||
SourceLocation Loc) { | ||||||
assert(Callee); | ||||||
|
||||||
// If there is no caller, exit. | ||||||
const FunctionDecl *Caller = getCurFunctionDecl(); | ||||||
if (!getCurFunctionDecl()) | ||||||
return; | ||||||
|
||||||
// Check if callee function is a format function. | ||||||
// If it is, check if caller function misses format attributes. | ||||||
|
||||||
if (!Callee->hasAttr<FormatAttr>()) | ||||||
return; | ||||||
|
||||||
// va_list is not intended to be passed to variadic function. | ||||||
if (Callee->isVariadic()) | ||||||
return; | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that there's just one code base in the world that does it, but Clang supports multiple format attributes on the same function, where one of them has a non-zero format index and any others pass a va_list. See #129954. In that context, bailing out because the function is variadic is too eager. |
||||||
|
||||||
// Check if va_list is passed to callee function. | ||||||
// If va_list is not passed, return. | ||||||
bool hasVaList = false; | ||||||
for (const auto *Param : Callee->parameters()) { | ||||||
if (Param->getOriginalType().getCanonicalType() == | ||||||
getASTContext().getBuiltinVaListType().getCanonicalType()) { | ||||||
hasVaList = true; | ||||||
break; | ||||||
} | ||||||
} | ||||||
if (!hasVaList) | ||||||
return; | ||||||
Comment on lines
+5955
to
+5966
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The attribute is also about checking the format string. So I don't think we should bail out here either. Just forwarding a parameter as format string should be enough to trigger the warning. Only for the fix-it we have to figure out what happens to the arguments. |
||||||
|
||||||
unsigned int FormatArgumentIndexOffset = | ||||||
checkIfMethodHasImplicitObjectParameter(Callee) ? 2 : 1; | ||||||
|
||||||
// If callee function is format function and format arguments are not | ||||||
// relevant to emit diagnostic, save only information about format type | ||||||
// (format index and first-to-check argument index are set to -1). | ||||||
// Information about format type is later used to determine if there are | ||||||
// more than one format type found. | ||||||
|
||||||
unsigned int NumArgs = Args.size(); | ||||||
// Check if function has format attribute with forwarded format string. | ||||||
IdentifierInfo *AttrType; | ||||||
const ParmVarDecl *FormatStringArg; | ||||||
if (!llvm::any_of( | ||||||
Callee->specific_attrs<FormatAttr>(), [&](const FormatAttr *Attr) { | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd suggest to call this from the loop in |
||||||
AttrType = Attr->getType(); | ||||||
|
||||||
int OffsetFormatIndex = | ||||||
Attr->getFormatIdx() - FormatArgumentIndexOffset; | ||||||
if (OffsetFormatIndex < 0 || (unsigned)OffsetFormatIndex >= NumArgs) | ||||||
return false; | ||||||
Comment on lines
+5987
to
+5988
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can this actually happen? I would have thought we get an error when the index is out of bounds and don't even get here. |
||||||
|
||||||
if (const auto *FormatArgExpr = dyn_cast<DeclRefExpr>( | ||||||
Args[OffsetFormatIndex]->IgnoreParenCasts())) | ||||||
if (FormatStringArg = dyn_cast_or_null<ParmVarDecl>( | ||||||
FormatArgExpr->getReferencedDeclOfCallee())) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is correct for what you're trying to do, but doesn't clang emit a warning for un-parenthesized assignments in if conditions? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this not enough? We already have a
Suggested change
|
||||||
return true; | ||||||
return false; | ||||||
})) { | ||||||
MissingAttributes.push_back( | ||||||
FormatAttr::CreateImplicit(getASTContext(), AttrType, -1, -1)); | ||||||
return; | ||||||
} | ||||||
|
||||||
unsigned ArgumentIndexOffset = | ||||||
checkIfMethodHasImplicitObjectParameter(Caller) ? 2 : 1; | ||||||
|
||||||
unsigned NumOfCallerFunctionParams = Caller->getNumParams(); | ||||||
|
||||||
// Compare caller and callee function format attribute arguments (archetype | ||||||
// and format string). If they don't match, caller misses format attribute. | ||||||
if (llvm::any_of( | ||||||
Caller->specific_attrs<FormatAttr>(), [&](const FormatAttr *Attr) { | ||||||
if (Attr->getType() != AttrType) | ||||||
return false; | ||||||
int OffsetFormatIndex = Attr->getFormatIdx() - ArgumentIndexOffset; | ||||||
|
||||||
if (OffsetFormatIndex < 0 || | ||||||
(unsigned)OffsetFormatIndex >= NumOfCallerFunctionParams) | ||||||
return false; | ||||||
|
||||||
if (Caller->parameters()[OffsetFormatIndex] != FormatStringArg) | ||||||
return false; | ||||||
|
||||||
return true; | ||||||
})) { | ||||||
MissingAttributes.push_back( | ||||||
FormatAttr::CreateImplicit(getASTContext(), AttrType, -1, -1)); | ||||||
return; | ||||||
} | ||||||
|
||||||
// Get format string index | ||||||
int FormatStringIndex = | ||||||
FormatStringArg->getFunctionScopeIndex() + ArgumentIndexOffset; | ||||||
|
||||||
// Get first argument index | ||||||
int FirstToCheck = Caller->isVariadic() | ||||||
? (NumOfCallerFunctionParams + ArgumentIndexOffset) | ||||||
: 0; | ||||||
|
||||||
// Do not add duplicate in a vector of missing format attributes. | ||||||
if (!llvm::any_of(MissingAttributes, [&](const FormatAttr *Attr) { | ||||||
return Attr->getType() == AttrType && | ||||||
Attr->getFormatIdx() == FormatStringIndex && | ||||||
Attr->getFirstArg() == FirstToCheck; | ||||||
})) | ||||||
MissingAttributes.push_back(FormatAttr::CreateImplicit( | ||||||
getASTContext(), AttrType, FormatStringIndex, FirstToCheck, Loc)); | ||||||
Comment on lines
+6038
to
+6045
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we should just not bother doing this and emit the warning right here. |
||||||
} | ||||||
|
||||||
// This function is called only if caller function is not template. | ||||||
// TODO: Add call for template functions. | ||||||
// Emits missing format attribute diagnostics. | ||||||
void Sema::EmitMissingFormatAttributesDiagnostic(const FunctionDecl *Caller) { | ||||||
const clang::IdentifierInfo *AttrType = MissingAttributes[0]->getType(); | ||||||
for (unsigned i = 1; i < MissingAttributes.size(); ++i) { | ||||||
if (AttrType != MissingAttributes[i]->getType()) { | ||||||
// Clear vector of missing attributes because it could be used in | ||||||
// diagnosing missing format attributes in another caller. | ||||||
MissingAttributes.clear(); | ||||||
return; | ||||||
} | ||||||
} | ||||||
|
||||||
for (const FormatAttr *FA : MissingAttributes) { | ||||||
// If format index and first-to-check argument index are negative, it means | ||||||
// that this attribute is only saved for multiple format types checking. | ||||||
if (FA->getFormatIdx() < 0 || FA->getFirstArg() < 0) | ||||||
continue; | ||||||
|
||||||
// If caller function has format attributes and callee format attribute type | ||||||
// mismatches caller attribute type, do not emit diagnostic. | ||||||
if (Caller->hasAttr<FormatAttr>() && | ||||||
!llvm::any_of(Caller->specific_attrs<FormatAttr>(), | ||||||
[FA](const FormatAttr *FunctionAttr) { | ||||||
return FA->getType() == FunctionAttr->getType(); | ||||||
})) | ||||||
continue; | ||||||
|
||||||
// Emit diagnostic | ||||||
SourceLocation Loc = Caller->getFirstDecl()->getLocation(); | ||||||
Diag(Loc, diag::warn_missing_format_attribute) | ||||||
<< FA->getType() << Caller | ||||||
<< FixItHint::CreateInsertion(Loc, | ||||||
(llvm::Twine("__attribute__((format(") + | ||||||
FA->getType()->getName() + ", " + | ||||||
llvm::Twine(FA->getFormatIdx()) + ", " + | ||||||
llvm::Twine(FA->getFirstArg()) + ")))") | ||||||
.str()); | ||||||
Diag(FA->getLocation(), diag::note_format_function) << FA->getType(); | ||||||
} | ||||||
|
||||||
// Clear vector of missing attributes after emitting diagnostics for caller | ||||||
// function because it could be used in diagnosing missing format attributes | ||||||
// in another caller. | ||||||
MissingAttributes.clear(); | ||||||
} | ||||||
|
||||||
//===----------------------------------------------------------------------===// | ||||||
// Microsoft specific attribute handlers. | ||||||
//===----------------------------------------------------------------------===// | ||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we put this into the block up there with the comment "Printf and scanf checking"? We care about the same case: calling a function with a
FormatAttr
. Up there is a loop going over these attributes. Why don't we add this call to the loop, processing a single attribute at a time?The reason is to avoid adding complexity in the common case of no such attribute. Up there we already have the check, so we're not adding anything.
And I don't think we lose anything. If anything, this should make the check simpler.
Although in theory I don't see a reason why this is contradictory at all, at least without argument checking: it would simply constrain the format string to strings accepted by multiple formatting languages.
Or, as @apple-fcloutier suggested, just put it into
CheckFormatArguments
.