Skip to content

[clang] Catch missing format attributes #105479

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -326,6 +326,9 @@ Improvements to Clang's diagnostics
- Now correctly diagnose a tentative definition of an array with static
storage duration in pedantic mode in C. (#GH50661)

- Clang now diagnoses missing format attributes for non-template functions
and class/struct/union members. (#GH60718)

Improvements to Clang's time-trace
----------------------------------

Expand Down
1 change: 0 additions & 1 deletion clang/include/clang/Basic/DiagnosticGroups.td
Original file line number Diff line number Diff line change
Expand Up @@ -550,7 +550,6 @@ def MainReturnType : DiagGroup<"main-return-type">;
def MaxUnsignedZero : DiagGroup<"max-unsigned-zero">;
def MissingBraces : DiagGroup<"missing-braces">;
def MissingDeclarations: DiagGroup<"missing-declarations">;
def : DiagGroup<"missing-format-attribute">;
def MissingIncludeDirs : DiagGroup<"missing-include-dirs">;
def MissingNoreturn : DiagGroup<"missing-noreturn">;
def MultiChar : DiagGroup<"multichar">;
Expand Down
4 changes: 4 additions & 0 deletions clang/include/clang/Basic/DiagnosticSemaKinds.td
Original file line number Diff line number Diff line change
Expand Up @@ -1099,6 +1099,10 @@ def err_opencl_invalid_param : Error<
"declaring function parameter of type %0 is not allowed%select{; did you forget * ?|}1">;
def err_opencl_invalid_return : Error<
"declaring function return value of type %0 is not allowed %select{; did you forget * ?|}1">;
def warn_missing_format_attribute : Warning<
"diagnostic behavior may be improved by adding the %0 format attribute to the declaration of %1">,
InGroup<DiagGroup<"missing-format-attribute">>, DefaultIgnore;
def note_format_function : Note<"%0 format function">;
def warn_pragma_options_align_reset_failed : Warning<
"#pragma options align=reset failed: %0">,
InGroup<IgnoredPragmas>;
Expand Down
7 changes: 7 additions & 0 deletions clang/include/clang/Sema/Attr.h
Original file line number Diff line number Diff line change
Expand Up @@ -123,6 +123,13 @@ inline bool isInstanceMethod(const Decl *D) {
return false;
}

inline bool checkIfMethodHasImplicitObjectParameter(const Decl *D) {
if (const auto *MethodDecl = dyn_cast<CXXMethodDecl>(D))
return MethodDecl->isInstance() &&
!MethodDecl->hasCXXExplicitFunctionObjectParameter();
return false;
}

/// Diagnose mutually exclusive attributes when present on a given
/// declaration. Returns true if diagnosed.
template <typename AttrTy>
Expand Down
5 changes: 5 additions & 0 deletions clang/include/clang/Sema/Sema.h
Original file line number Diff line number Diff line change
Expand Up @@ -4713,6 +4713,11 @@ class Sema final : public SemaBase {

enum class RetainOwnershipKind { NS, CF, OS };

void DetectMissingFormatAttributes(const FunctionDecl *Callee,
ArrayRef<const Expr *> Args,
SourceLocation Loc);
void EmitMissingFormatAttributesDiagnostic(const FunctionDecl *Caller);

UuidAttr *mergeUuidAttr(Decl *D, const AttributeCommonInfo &CI,
StringRef UuidAsWritten, MSGuidDecl *GuidDecl);

Expand Down
4 changes: 3 additions & 1 deletion clang/lib/Sema/SemaChecking.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3460,8 +3460,10 @@ void Sema::checkCall(NamedDecl *FDecl, const FunctionProtoType *Proto,
}
}

if (FD)
if (FD) {
diagnoseArgDependentDiagnoseIfAttrs(FD, ThisArg, Args, Loc);
DetectMissingFormatAttributes(FD, Args, Loc);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put this into the block up there with the comment "Printf and scanf checking"? We care about the same case: calling a function with a FormatAttr. Up there is a loop going over these attributes. Why don't we add this call to the loop, processing a single attribute at a time?

The reason is to avoid adding complexity in the common case of no such attribute. Up there we already have the check, so we're not adding anything.

And I don't think we lose anything. If anything, this should make the check simpler.

  • Callees with multiple attributes are probably rare, but in that case we could still forward multiple format strings as @apple-fcloutier pointed out.
  • If we treat these attributes independently and suggest contradictory attributes, this is probably an issue anyway.

Although in theory I don't see a reason why this is contradictory at all, at least without argument checking: it would simply constrain the format string to strings accepted by multiple formatting languages.

Or, as @apple-fcloutier suggested, just put it into CheckFormatArguments.

}
}

void Sema::CheckConstrainedAuto(const AutoType *AutoT, SourceLocation Loc) {
Expand Down
2 changes: 2 additions & 0 deletions clang/lib/Sema/SemaDecl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -16340,6 +16340,8 @@ Decl *Sema::ActOnFinishFunctionBody(Decl *dcl, Stmt *Body,
}
}

EmitMissingFormatAttributesDiagnostic(FD);

// We might not have found a prototype because we didn't wish to warn on
// the lack of a missing prototype. Try again without the checks for
// whether we want to warn on the missing prototype.
Expand Down
179 changes: 177 additions & 2 deletions clang/lib/Sema/SemaDeclAttr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3899,7 +3899,7 @@ static bool handleFormatAttrCommon(Sema &S, Decl *D, const ParsedAttr &AL,

// In C++ the implicit 'this' function parameter also counts, and they are
// counted from one.
bool HasImplicitThisParam = isInstanceMethod(D);
bool HasImplicitThisParam = checkIfMethodHasImplicitObjectParameter(D);
Info->NumArgs = getFunctionOrMethodNumParams(D) + HasImplicitThisParam;

Info->Identifier = AL.getArgAsIdent(0)->Ident;
Expand Down Expand Up @@ -4042,7 +4042,7 @@ static void handleCallbackAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
return;
}

bool HasImplicitThisParam = isInstanceMethod(D);
bool HasImplicitThisParam = checkIfMethodHasImplicitObjectParameter(D);
int32_t NumArgs = getFunctionOrMethodNumParams(D);

FunctionDecl *FD = D->getAsFunction();
Expand Down Expand Up @@ -5918,6 +5918,181 @@ static void handlePreferredTypeAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
D->addAttr(::new (S.Context) PreferredTypeAttr(S.Context, AL, ParmTSI));
}

// Diagnosing missing format attributes is implemented in two steps:
// 1. Detect missing format attributes while checking function calls.
// 2. Emit diagnostic in part that processes function body.
// For this purpose it is created vector that stores information about format
// attributes. There are no two format attributes with same arguments in a
// vector. Vector could contains attributes that only store information about
// format type (format string and first to check argument are set to -1).
namespace {
std::vector<FormatAttr *> MissingAttributes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clang does not use global mutable state. You need to find some other way to keep that around.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to keep it around at all.

The patch wants to avoid emitting the warning when the same parameter is used in different format functions, so as to avoid suggesting to add multiple incompatible attributes. But since different format functions tend to use a different language, this is probably extremely rare. We should of course try to avoid false positives, but in this case it seems rather hypothetical, and it seems to come with a high cost.

If we stop worrying about that, we can directly emit the warning when checking a call.

} // end anonymous namespace

// This function is called only if function call is not inside template body.
// TODO: Add call for function calls inside template body.
// Detects and stores missing format attributes in a vector.
void Sema::DetectMissingFormatAttributes(const FunctionDecl *Callee,
ArrayRef<const Expr *> Args,
SourceLocation Loc) {
assert(Callee);

// If there is no caller, exit.
const FunctionDecl *Caller = getCurFunctionDecl();
if (!getCurFunctionDecl())
return;

// Check if callee function is a format function.
// If it is, check if caller function misses format attributes.

if (!Callee->hasAttr<FormatAttr>())
return;

// va_list is not intended to be passed to variadic function.
if (Callee->isVariadic())
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that there's just one code base in the world that does it, but Clang supports multiple format attributes on the same function, where one of them has a non-zero format index and any others pass a va_list. See #129954. In that context, bailing out because the function is variadic is too eager.


// Check if va_list is passed to callee function.
// If va_list is not passed, return.
bool hasVaList = false;
for (const auto *Param : Callee->parameters()) {
if (Param->getOriginalType().getCanonicalType() ==
getASTContext().getBuiltinVaListType().getCanonicalType()) {
hasVaList = true;
break;
}
}
if (!hasVaList)
return;
Comment on lines +5955 to +5966
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attribute is also about checking the format string. So I don't think we should bail out here either. Just forwarding a parameter as format string should be enough to trigger the warning. Only for the fix-it we have to figure out what happens to the arguments.


unsigned int FormatArgumentIndexOffset =
checkIfMethodHasImplicitObjectParameter(Callee) ? 2 : 1;

// If callee function is format function and format arguments are not
// relevant to emit diagnostic, save only information about format type
// (format index and first-to-check argument index are set to -1).
// Information about format type is later used to determine if there are
// more than one format type found.

unsigned int NumArgs = Args.size();
// Check if function has format attribute with forwarded format string.
IdentifierInfo *AttrType;
const ParmVarDecl *FormatStringArg;
if (!llvm::any_of(
Callee->specific_attrs<FormatAttr>(), [&](const FormatAttr *Attr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to call this from the loop in checkCall that goes through FDecl->specific_attrs<FormatAttr>(). Then we'd be looking at a single attribute here.

AttrType = Attr->getType();

int OffsetFormatIndex =
Attr->getFormatIdx() - FormatArgumentIndexOffset;
if (OffsetFormatIndex < 0 || (unsigned)OffsetFormatIndex >= NumArgs)
return false;
Comment on lines +5987 to +5988
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this actually happen? I would have thought we get an error when the index is out of bounds and don't even get here.


if (const auto *FormatArgExpr = dyn_cast<DeclRefExpr>(
Args[OffsetFormatIndex]->IgnoreParenCasts()))
if (FormatStringArg = dyn_cast_or_null<ParmVarDecl>(
FormatArgExpr->getReferencedDeclOfCallee()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct for what you're trying to do, but doesn't clang emit a warning for un-parenthesized assignments in if conditions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not enough? We already have a DeclRefExpr, so I don't think we need the generic Expr::getReferencedDeclOfCallee.

Suggested change
FormatArgExpr->getReferencedDeclOfCallee()))
FormatArgExpr->getDecl()))

return true;
return false;
})) {
MissingAttributes.push_back(
FormatAttr::CreateImplicit(getASTContext(), AttrType, -1, -1));
return;
}

unsigned ArgumentIndexOffset =
checkIfMethodHasImplicitObjectParameter(Caller) ? 2 : 1;

unsigned NumOfCallerFunctionParams = Caller->getNumParams();

// Compare caller and callee function format attribute arguments (archetype
// and format string). If they don't match, caller misses format attribute.
if (llvm::any_of(
Caller->specific_attrs<FormatAttr>(), [&](const FormatAttr *Attr) {
if (Attr->getType() != AttrType)
return false;
int OffsetFormatIndex = Attr->getFormatIdx() - ArgumentIndexOffset;

if (OffsetFormatIndex < 0 ||
(unsigned)OffsetFormatIndex >= NumOfCallerFunctionParams)
return false;

if (Caller->parameters()[OffsetFormatIndex] != FormatStringArg)
return false;

return true;
})) {
MissingAttributes.push_back(
FormatAttr::CreateImplicit(getASTContext(), AttrType, -1, -1));
return;
}

// Get format string index
int FormatStringIndex =
FormatStringArg->getFunctionScopeIndex() + ArgumentIndexOffset;

// Get first argument index
int FirstToCheck = Caller->isVariadic()
? (NumOfCallerFunctionParams + ArgumentIndexOffset)
: 0;

// Do not add duplicate in a vector of missing format attributes.
if (!llvm::any_of(MissingAttributes, [&](const FormatAttr *Attr) {
return Attr->getType() == AttrType &&
Attr->getFormatIdx() == FormatStringIndex &&
Attr->getFirstArg() == FirstToCheck;
}))
MissingAttributes.push_back(FormatAttr::CreateImplicit(
getASTContext(), AttrType, FormatStringIndex, FirstToCheck, Loc));
Comment on lines +6038 to +6045
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just not bother doing this and emit the warning right here.

}

// This function is called only if caller function is not template.
// TODO: Add call for template functions.
// Emits missing format attribute diagnostics.
void Sema::EmitMissingFormatAttributesDiagnostic(const FunctionDecl *Caller) {
const clang::IdentifierInfo *AttrType = MissingAttributes[0]->getType();
for (unsigned i = 1; i < MissingAttributes.size(); ++i) {
if (AttrType != MissingAttributes[i]->getType()) {
// Clear vector of missing attributes because it could be used in
// diagnosing missing format attributes in another caller.
MissingAttributes.clear();
return;
}
}

for (const FormatAttr *FA : MissingAttributes) {
// If format index and first-to-check argument index are negative, it means
// that this attribute is only saved for multiple format types checking.
if (FA->getFormatIdx() < 0 || FA->getFirstArg() < 0)
continue;

// If caller function has format attributes and callee format attribute type
// mismatches caller attribute type, do not emit diagnostic.
if (Caller->hasAttr<FormatAttr>() &&
!llvm::any_of(Caller->specific_attrs<FormatAttr>(),
[FA](const FormatAttr *FunctionAttr) {
return FA->getType() == FunctionAttr->getType();
}))
continue;

// Emit diagnostic
SourceLocation Loc = Caller->getFirstDecl()->getLocation();
Diag(Loc, diag::warn_missing_format_attribute)
<< FA->getType() << Caller
<< FixItHint::CreateInsertion(Loc,
(llvm::Twine("__attribute__((format(") +
FA->getType()->getName() + ", " +
llvm::Twine(FA->getFormatIdx()) + ", " +
llvm::Twine(FA->getFirstArg()) + ")))")
.str());
Diag(FA->getLocation(), diag::note_format_function) << FA->getType();
}

// Clear vector of missing attributes after emitting diagnostics for caller
// function because it could be used in diagnosing missing format attributes
// in another caller.
MissingAttributes.clear();
}

//===----------------------------------------------------------------------===//
// Microsoft specific attribute handlers.
//===----------------------------------------------------------------------===//
Expand Down
Loading
Loading