-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang] Catch missing format attributes #105479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[clang] Catch missing format attributes #105479
Conversation
@llvm/pr-subscribers-clang Author: Budimir Aranđelović (budimirarandjelovichtec) ChangesEnable flag -Wmissing-format-attribute to catch missing attributes. Fixes #60718 Patch is 40.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/105479.diff 9 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 6adf57da42e656..f9fd4a20ea6bf6 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -707,6 +707,9 @@ Improvements to Clang's diagnostics
- Clang now diagnoses integer constant expressions that are folded to a constant value as an extension in more circumstances. Fixes #GH59863
+- Clang now diagnoses missing format attributes for non-template functions and
+ class/struct/union members. Fixes #GH60718
+
Improvements to Clang's time-trace
----------------------------------
diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td
index 2241f8481484e2..da6a3b2fe3571b 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -525,7 +525,6 @@ def MainReturnType : DiagGroup<"main-return-type">;
def MaxUnsignedZero : DiagGroup<"max-unsigned-zero">;
def MissingBraces : DiagGroup<"missing-braces">;
def MissingDeclarations: DiagGroup<"missing-declarations">;
-def : DiagGroup<"missing-format-attribute">;
def MissingIncludeDirs : DiagGroup<"missing-include-dirs">;
def MissingNoreturn : DiagGroup<"missing-noreturn">;
def MultiChar : DiagGroup<"multichar">;
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 940f9ac226ca87..75e51c5aa166d2 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -1031,6 +1031,9 @@ def err_opencl_invalid_param : Error<
def err_opencl_invalid_return : Error<
"declaring function return value of type %0 is not allowed %select{; did you forget * ?|}1">;
def warn_enum_value_overflow : Warning<"overflow in enumeration value">;
+def warn_missing_format_attribute : Warning<
+ "diagnostic behavior may be improved by adding the %0 format attribute to the declaration of %1">,
+ InGroup<DiagGroup<"missing-format-attribute">>, DefaultIgnore;
def warn_pragma_options_align_reset_failed : Warning<
"#pragma options align=reset failed: %0">,
InGroup<IgnoredPragmas>;
diff --git a/clang/include/clang/Sema/Attr.h b/clang/include/clang/Sema/Attr.h
index 3f0b10212789a4..37c124ca7b454a 100644
--- a/clang/include/clang/Sema/Attr.h
+++ b/clang/include/clang/Sema/Attr.h
@@ -123,6 +123,13 @@ inline bool isInstanceMethod(const Decl *D) {
return false;
}
+inline bool checkIfMethodHasImplicitObjectParameter(const Decl *D) {
+ if (const auto *MethodDecl = dyn_cast<CXXMethodDecl>(D))
+ return MethodDecl->isInstance() &&
+ !MethodDecl->hasCXXExplicitFunctionObjectParameter();
+ return false;
+}
+
/// Diagnose mutually exclusive attributes when present on a given
/// declaration. Returns true if diagnosed.
template <typename AttrTy>
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 57994f4033922c..9022ac86edf817 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -4602,6 +4602,10 @@ class Sema final : public SemaBase {
enum class RetainOwnershipKind { NS, CF, OS };
+ void DiagnoseMissingFormatAttributes(Stmt *Body, const FunctionDecl *FDecl);
+ std::vector<FormatAttr *>
+ GetMissingFormatAttributes(Stmt *Body, const FunctionDecl *FDecl);
+
UuidAttr *mergeUuidAttr(Decl *D, const AttributeCommonInfo &CI,
StringRef UuidAsWritten, MSGuidDecl *GuidDecl);
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 66eeaa8e6f7777..29c28d4b567513 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -15925,6 +15925,8 @@ Decl *Sema::ActOnFinishFunctionBody(Decl *dcl, Stmt *Body,
}
}
+ DiagnoseMissingFormatAttributes(Body, FD);
+
// We might not have found a prototype because we didn't wish to warn on
// the lack of a missing prototype. Try again without the checks for
// whether we want to warn on the missing prototype.
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index f2cd46d1e7c932..4634dd8b9cce80 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -3508,7 +3508,7 @@ static void handleFormatAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
// In C++ the implicit 'this' function parameter also counts, and they are
// counted from one.
- bool HasImplicitThisParam = isInstanceMethod(D);
+ bool HasImplicitThisParam = checkIfMethodHasImplicitObjectParameter(D);
unsigned NumArgs = getFunctionOrMethodNumParams(D) + HasImplicitThisParam;
IdentifierInfo *II = AL.getArgAsIdent(0)->Ident;
@@ -3621,7 +3621,7 @@ static void handleCallbackAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
return;
}
- bool HasImplicitThisParam = isInstanceMethod(D);
+ bool HasImplicitThisParam = checkIfMethodHasImplicitObjectParameter(D);
int32_t NumArgs = getFunctionOrMethodNumParams(D);
FunctionDecl *FD = D->getAsFunction();
@@ -5320,6 +5320,218 @@ static void handlePreferredTypeAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
D->addAttr(::new (S.Context) PreferredTypeAttr(S.Context, AL, ParmTSI));
}
+// This function is called only if function call is not inside template body.
+// TODO: Add call for function calls inside template body.
+// Emit warnings if parent function misses format attributes.
+void Sema::DiagnoseMissingFormatAttributes(Stmt *Body,
+ const FunctionDecl *FDecl) {
+ assert(FDecl);
+
+ // If there are no function body, exit.
+ if (!Body)
+ return;
+
+ // Get missing format attributes
+ std::vector<FormatAttr *> MissingFormatAttributes =
+ GetMissingFormatAttributes(Body, FDecl);
+ if (MissingFormatAttributes.empty())
+ return;
+
+ // Check if there are more than one format type found. In that case do not
+ // emit diagnostic.
+ const clang::IdentifierInfo *AttrType = MissingFormatAttributes[0]->getType();
+ if (llvm::any_of(MissingFormatAttributes, [&](const FormatAttr *Attr) {
+ return AttrType != Attr->getType();
+ }))
+ return;
+
+ for (const FormatAttr *FA : MissingFormatAttributes) {
+ // If format index and first-to-check argument index are negative, it means
+ // that this attribute is only saved for multiple format types checking.
+ if (FA->getFormatIdx() < 0 || FA->getFirstArg() < 0)
+ continue;
+
+ // Emit diagnostic
+ SourceLocation Loc = FDecl->getLocation();
+ Diag(Loc, diag::warn_missing_format_attribute)
+ << FA->getType() << FDecl
+ << FixItHint::CreateInsertion(Loc,
+ (llvm::Twine("__attribute__((format(") +
+ FA->getType()->getName() + ", " +
+ llvm::Twine(FA->getFormatIdx()) + ", " +
+ llvm::Twine(FA->getFirstArg()) + ")))")
+ .str());
+ }
+}
+
+// Returns vector of format attributes. There are no two attributes with same
+// arguments in returning vector. There can be attributes that effectivelly only
+// store information about format type.
+std::vector<FormatAttr *>
+Sema::GetMissingFormatAttributes(Stmt *Body, const FunctionDecl *FDecl) {
+ unsigned int FunctionFormatArgumentIndexOffset =
+ checkIfMethodHasImplicitObjectParameter(FDecl) ? 2 : 1;
+
+ std::vector<FormatAttr *> MissingAttributes;
+
+ // Iterate over body statements.
+ for (auto *Child : Body->children()) {
+ // If child statement is compound statement, recursively get missing
+ // attributes.
+ if (dyn_cast_or_null<CompoundStmt>(Child)) {
+ std::vector<FormatAttr *> CompoundStmtMissingAttributes =
+ GetMissingFormatAttributes(Child, FDecl);
+
+ // If there are already missing attributes with same arguments, do not add
+ // duplicates.
+ for (FormatAttr *FA : CompoundStmtMissingAttributes) {
+ if (!llvm::any_of(MissingAttributes, [&](const FormatAttr *Attr) {
+ return FA->getType() == Attr->getType() &&
+ FA->getFormatIdx() == Attr->getFormatIdx() &&
+ FA->getFirstArg() == Attr->getFirstArg();
+ }))
+ MissingAttributes.push_back(FA);
+ }
+
+ continue;
+ }
+
+ ValueStmt *VS = dyn_cast_or_null<ValueStmt>(Child);
+ if (!VS)
+ continue;
+ Expr *TheExpr = VS->getExprStmt();
+ if (!TheExpr)
+ continue;
+ CallExpr *TheCall = dyn_cast_or_null<CallExpr>(TheExpr);
+ if (!TheCall)
+ continue;
+ const FunctionDecl *ChildFunction =
+ dyn_cast_or_null<FunctionDecl>(TheCall->getCalleeDecl());
+ if (!ChildFunction)
+ continue;
+
+ Expr **Args = TheCall->getArgs();
+ unsigned int NumArgs = TheCall->getNumArgs();
+
+ // If child expression is function, check if it is format function.
+ // If it is, check if parent function misses format attributes.
+
+ // If child function is format function and format arguments are not
+ // relevant to emit diagnostic, save only information about format type
+ // (format index and first-to-check argument index are set to -1).
+ // Information about format type is later used to determine if there are
+ // more than one format type found.
+
+ unsigned int ChildFunctionFormatArgumentIndexOffset =
+ checkIfMethodHasImplicitObjectParameter(ChildFunction) ? 2 : 1;
+
+ // Check if function has format attribute with forwarded format string.
+ IdentifierInfo *AttrType;
+ const ParmVarDecl *FormatArg;
+ if (llvm::any_of(ChildFunction->specific_attrs<FormatAttr>(),
+ [&](const FormatAttr *Attr) {
+ AttrType = Attr->getType();
+
+ int OffsetFormatIndex =
+ Attr->getFormatIdx() -
+ ChildFunctionFormatArgumentIndexOffset;
+ if (OffsetFormatIndex < 0 ||
+ (unsigned)OffsetFormatIndex >= NumArgs)
+ return true;
+
+ const auto *FormatArgExpr = dyn_cast<DeclRefExpr>(
+ Args[OffsetFormatIndex]->IgnoreParenCasts());
+ if (!FormatArgExpr)
+ return true;
+
+ FormatArg = dyn_cast_or_null<ParmVarDecl>(
+ FormatArgExpr->getReferencedDeclOfCallee());
+ if (!FormatArg)
+ return true;
+
+ return false;
+ })) {
+ MissingAttributes.push_back(
+ FormatAttr::CreateImplicit(getASTContext(), AttrType, -1, -1));
+ continue;
+ }
+
+ // Do not add in a vector format attributes whose type is different than
+ // parent function attribute type.
+ if (llvm::any_of(FDecl->specific_attrs<FormatAttr>(),
+ [&](const FormatAttr *FunctionAttr) {
+ return AttrType != FunctionAttr->getType();
+ }))
+ continue;
+
+ // Check if format string argument is parent function parameter.
+ int StringIndex = 0;
+ if (!llvm::any_of(FDecl->parameters(), [&](const ParmVarDecl *Param) {
+ if (Param != FormatArg)
+ return false;
+
+ StringIndex = Param->getFunctionScopeIndex() +
+ FunctionFormatArgumentIndexOffset;
+
+ return true;
+ })) {
+ MissingAttributes.push_back(
+ FormatAttr::CreateImplicit(getASTContext(), AttrType, -1, -1));
+ continue;
+ }
+
+ unsigned NumOfParentFunctionParams = FDecl->getNumParams();
+
+ // Compare parent and calling function format attribute arguments (archetype
+ // and format string).
+ if (llvm::any_of(
+ FDecl->specific_attrs<FormatAttr>(), [&](const FormatAttr *Attr) {
+ if (Attr->getType() != AttrType)
+ return false;
+ int OffsetFormatIndex =
+ Attr->getFormatIdx() - FunctionFormatArgumentIndexOffset;
+
+ if (OffsetFormatIndex < 0 ||
+ (unsigned)OffsetFormatIndex >= NumOfParentFunctionParams)
+ return false;
+
+ if (FDecl->parameters()[OffsetFormatIndex] != FormatArg)
+ return false;
+
+ return true;
+ })) {
+ MissingAttributes.push_back(
+ FormatAttr::CreateImplicit(getASTContext(), AttrType, -1, -1));
+ continue;
+ }
+
+ // Get first argument index
+ int FirstToCheck = [&]() -> unsigned {
+ if (!FDecl->isVariadic())
+ return 0;
+ const auto *FirstToCheckArg = Args[NumArgs - 1]->IgnoreParenCasts();
+ if (!FirstToCheckArg ||
+ FirstToCheckArg->getType().getCanonicalType() !=
+ Context.getBuiltinVaListType().getCanonicalType())
+ return 0;
+ return NumOfParentFunctionParams + FunctionFormatArgumentIndexOffset;
+ }();
+
+ // If there are already attributes which arguments matches arguments
+ // detected in this iteration, do not add new attribute as it would be
+ // duplicate.
+ if (!llvm::any_of(MissingAttributes, [&](const FormatAttr *Attr) {
+ return Attr->getType() == AttrType &&
+ Attr->getFormatIdx() == StringIndex &&
+ Attr->getFirstArg() == FirstToCheck;
+ }))
+ MissingAttributes.push_back(FormatAttr::CreateImplicit(
+ getASTContext(), AttrType, StringIndex, FirstToCheck));
+ }
+
+ return MissingAttributes;
+}
+
//===----------------------------------------------------------------------===//
// Microsoft specific attribute handlers.
//===----------------------------------------------------------------------===//
diff --git a/clang/test/Sema/attr-format-missing.c b/clang/test/Sema/attr-format-missing.c
new file mode 100644
index 00000000000000..e887e11d767040
--- /dev/null
+++ b/clang/test/Sema/attr-format-missing.c
@@ -0,0 +1,393 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=expected,c_diagnostics -Wmissing-format-attribute %s
+// RUN: %clang_cc1 -fsyntax-only -Wmissing-format-attribute -fdiagnostics-parseable-fixits %s 2>&1 | FileCheck %s --check-prefixes=CHECK,C-CHECK
+// RUN: %clang_cc1 -fsyntax-only -x c++ -verify=expected,cpp_diagnostics -Wmissing-format-attribute %s
+// RUN: %clang_cc1 -fsyntax-only -x c++ -verify=expected,cpp_diagnostics -std=c++2b -Wmissing-format-attribute %s
+// RUN: %clang_cc1 -fsyntax-only -x c++ -verify=expected,cpp_diagnostics -std=c++23 -Wmissing-format-attribute %s
+// RUN: not %clang_cc1 -fsyntax-only -x c++ -Wmissing-format-attribute -fdiagnostics-parseable-fixits %s 2>&1 | FileCheck %s
+
+#ifndef __cplusplus
+typedef __CHAR16_TYPE__ char16_t;
+typedef __CHAR32_TYPE__ char32_t;
+typedef __WCHAR_TYPE__ wchar_t;
+#endif
+
+typedef __SIZE_TYPE__ size_t;
+typedef __builtin_va_list va_list;
+
+__attribute__((__format__(__printf__, 1, 2)))
+int printf(const char *, ...); // #printf
+
+__attribute__((__format__(__scanf__, 1, 2)))
+int scanf(const char *, ...); // #scanf
+
+__attribute__((__format__(__printf__, 1, 0)))
+int vprintf(const char *, va_list); // #vprintf
+
+__attribute__((__format__(__scanf__, 1, 0)))
+int vscanf(const char *, va_list); // #vscanf
+
+__attribute__((__format__(__printf__, 2, 0)))
+int vsprintf(char *, const char *, va_list); // #vsprintf
+
+__attribute__((__format__(__printf__, 3, 0)))
+int vsnprintf(char *ch, size_t, const char *, va_list); // #vsnprintf
+
+__attribute__((__format__(__scanf__, 1, 4)))
+void f1(char *out, const size_t len, const char *format, ... /* args */) // #f1
+{
+ va_list args;
+ vsnprintf(out, len, format, args); // expected-no-warning@#f1
+}
+
+__attribute__((__format__(__printf__, 1, 4)))
+void f2(char *out, const size_t len, const char *format, ... /* args */) // #f2
+{
+ va_list args;
+ vsnprintf(out, len, format, args); // expected-warning@#f2 {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f2'}}
+ // CHECK: fix-it:"{{.*}}":{[[@LINE-4]]:6-[[@LINE-4]]:6}:"__attribute__((format(printf, 3, 4)))"
+}
+
+void f3(char *out, va_list args) // #f3
+{
+ vprintf(out, args); // expected-warning@#f3 {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f3'}}
+ // CHECK: fix-it:"{{.*}}":{[[@LINE-3]]:6-[[@LINE-3]]:6}:"__attribute__((format(printf, 1, 0)))"
+}
+
+void f4(char* out, ... /* args */) // #f4
+{
+ va_list args;
+ vprintf("test", args); // expected-no-warning@#f4
+
+ const char *ch;
+ vprintf(ch, args); // expected-no-warning@#f4
+}
+
+void f5(va_list args) // #f5
+{
+ char *ch;
+ vscanf(ch, args); // expected-no-warning@#f5
+}
+
+void f6(char *out, va_list args) // #f6
+{
+ char *ch;
+ vprintf(ch, args); // expected-no-warning@#f6
+ vprintf("test", args); // expected-no-warning@#f6
+ vprintf(out, args); // expected-warning@#f6 {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f6'}}
+ // CHECK: fix-it:"{{.*}}":{[[@LINE-6]]:6-[[@LINE-6]]:6}:"__attribute__((format(printf, 1, 0)))"
+}
+
+void f7(const char *out, ... /* args */) // #f7
+{
+ va_list args;
+
+ vscanf(out, args); // expected-warning@#f7 {{diagnostic behavior may be improved by adding the 'scanf' format attribute to the declaration of 'f7'}}
+ // CHECK: fix-it:"{{.*}}":{[[@LINE-5]]:6-[[@LINE-5]]:6}:"__attribute__((format(scanf, 1, 2)))"
+}
+
+void f8(const char *out, ... /* args */) // #f8
+{
+ va_list args;
+
+ vscanf(out, args); // expected-no-warning@#f8
+ vprintf(out, args); // expected-no-warning@#f8
+}
+
+void f9(const char out[], ... /* args */) // #f9
+{
+ va_list args;
+ char *ch;
+ vprintf(ch, args); // expected-no-warning
+ vsprintf(ch, out, args); // expected-warning@#f9 {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f9'}}
+ // CHECK: fix-it:"{{.*}}":{[[@LINE-6]]:6-[[@LINE-6]]:6}:"__attribute__((format(printf, 1, 2)))"
+}
+
+void f10(const wchar_t *out, ... /* args */) // #f10
+{
+ va_list args;
+ vscanf(out, args);
+#if (defined(__aarch64__) && !defined(_WIN64)) || (defined(__arm__) && !defined(_WIN32))
+ // c_diagnostics-warning@-2 {{incompatible pointer types passing 'const wchar_t *' (aka 'const unsigned int *') to parameter of type 'const char *'}}
+#elif __SIZEOF_WCHAR_T__ == 4
+ // c_diagnostics-warning@-4 {{incompatible pointer types passing 'const wchar_t *' (aka 'const int *') to parameter of type 'const char *'}}
+#else
+ // c_diagnostics-warning@-6 {{incompatible pointer types passing 'const wchar_t *' (aka 'const unsigned short *') to parameter of type 'const char *'}}
+#endif
+ // c_diagnostics-note@#vscanf {{passing argument to parameter here}}
+ // c_diagnostics-warning@#f10 {{diagnostic behavior may be improved by adding the 'scanf' format attribute to the declaration of 'f10'}}
+ // C-CHECK: fix-it:"{{.*}}":{[[@LINE-13]]:6-[[@LINE-13]]:6}:"__attribute__((format(scanf, 1, 2)))"
+ // cpp_diagnostics-error@-11 {{no matching function for call to 'vscanf'}}
+ // cpp_diagnostics-note@#vscanf {{candidate function not viable: no known conversion from 'const wchar_t *' to 'const char *' for 1st argument}}
+}
+
+void f11(const wchar_t *out, ... /* args */) // #f11
+{
+ va_list args;
+ vscanf((const char *) out, args); // expected-warning@#f11 {{diagnostic behavior may be improved by adding the 'scanf' format attribute to the declaration of 'f11'}}
+ // CHECK: fix-it:"{{.*}}":{[[@LINE-4]]:6-[[@LINE-4]]:6}:"__attribute__((format(scanf, 1, 2)))"
+}
+
+void f12(const wchar_t *out, ... /* args */) // #f12
+{
+ va_list args;
+ vscanf((char *) out, args); // expected-warning@#f12 {{diagnostic behavior may be improved by adding the 'scanf' format attribute to the declaration of 'f12'}}
+ // CHECK: fix-it:"{{.*}}":{[[@LINE-4]]:6-[[@LINE-4]]:6}:"__attribute__((format(sc...
[truncated]
|
9f61f95
to
9a9169f
Compare
Can you explain what changes were made to address the various issues which caused the earlier revert? I've tried diffing the PRs to see what the changes are, but my git-fu is insufficient to the task. :-D |
Just for convenience I extracted update diff into #106649 |
I made changes in several places. Here are major changes:
Relevant parts of code are reviewed in #106649 . |
e57e2fb
to
11aef7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Let's wait few days if there are any opinions. Then, someone from my company will merge it. |
clang/lib/Sema/SemaDeclAttr.cpp
Outdated
if (FDecl->parameters()[OffsetFormatIndex] != FormatArg) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't we already found FormatArg
in FDecl->parameters()
in the previous any_of
? I presume that parameters have to be unique.
Couldn't we just compare Attr->getFormatIdx()
with StringIndex
? The attribute that we'll be suggesting below contains exactly that index.
In fact I wonder if we couldn't also rely on the check for duplicates here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In previous, any_of
it was found callee
format argument. In this any_of
, caller
format argument is checked with callee
one. If they mismatch, it means that caller
misses format attribute.
11aef7c
to
6300500
Compare
d845def
to
d04f97b
Compare
33135f7
to
e4faa01
Compare
38d9734
to
3ffb989
Compare
3ffb989
to
052fc92
Compare
052fc92
to
26b6997
Compare
848ebca
to
1a096f6
Compare
1a096f6
to
f720ba7
Compare
f720ba7
to
2571d6f
Compare
2571d6f
to
e5f4b28
Compare
|
e5f4b28
to
502404e
Compare
Ping @vitalybuka @aaronpuchert |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand your change, it can only trip when -Wformat-nonliteral also trips. I think that you should implement this as a note/fixit on -Wformat-nonliteral. You should be able to leverage checkFormatString
's value traversal to have the same precision as -Wformat checking. I don't think that your change correctly identifies this case, for instance:
void foo(const char *a, ...) {
va_list ap;
const char *const b = a;
va_start(ap, a);
vprintf(b, ap);
va_end(ap);
}
whereas -Wformat will dig that deep.
|
||
// va_list is not intended to be passed to variadic function. | ||
if (Callee->isVariadic()) | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that there's just one code base in the world that does it, but Clang supports multiple format attributes on the same function, where one of them has a non-zero format index and any others pass a va_list. See #129954. In that context, bailing out because the function is variadic is too eager.
if (const auto *FormatArgExpr = dyn_cast<DeclRefExpr>( | ||
Args[OffsetFormatIndex]->IgnoreParenCasts())) | ||
if (FormatStringArg = dyn_cast_or_null<ParmVarDecl>( | ||
FormatArgExpr->getReferencedDeclOfCallee())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correct for what you're trying to do, but doesn't clang emit a warning for un-parenthesized assignments in if conditions?
// vector. Vector could contains attributes that only store information about | ||
// format type (format string and first to check argument are set to -1). | ||
namespace { | ||
std::vector<FormatAttr *> MissingAttributes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang does not use global mutable state. You need to find some other way to keep that around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to keep it around at all.
The patch wants to avoid emitting the warning when the same parameter is used in different format functions, so as to avoid suggesting to add multiple incompatible attributes. But since different format functions tend to use a different language, this is probably extremely rare. We should of course try to avoid false positives, but in this case it seems rather hypothetical, and it seems to come with a high cost.
If we stop worrying about that, we can directly emit the warning when checking a call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand your change, it can only trip when
-Wformat-nonliteral
also trips. I think that you should implement this as a note/fixit on-Wformat-nonliteral
.
Good observation, and I agree. It's even documented that this is one way to fix the warning:
Clang implements this mostly the same way as GCC, but there is a difference for functions that accept a
va_list
argument (for example,vprintf
). GCC does not emit-Wformat-nonliteral
warning for calls to such functions. Clang does not warn if the format string comes from a function parameter, where the function is annotated with a compatible attribute, otherwise it warns.
An interesting question is whether we should have a separate warning about forwarding that doesn't warn about other nonliterals. But I don't see why one would want that. After all the idea of these warnings is to follow format strings all the way up the stack until we reach a constant string that we can check.
This could also simplify the change a lot.
// vector. Vector could contains attributes that only store information about | ||
// format type (format string and first to check argument are set to -1). | ||
namespace { | ||
std::vector<FormatAttr *> MissingAttributes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to keep it around at all.
The patch wants to avoid emitting the warning when the same parameter is used in different format functions, so as to avoid suggesting to add multiple incompatible attributes. But since different format functions tend to use a different language, this is probably extremely rare. We should of course try to avoid false positives, but in this case it seems rather hypothetical, and it seems to come with a high cost.
If we stop worrying about that, we can directly emit the warning when checking a call.
diagnoseArgDependentDiagnoseIfAttrs(FD, ThisArg, Args, Loc); | ||
DetectMissingFormatAttributes(FD, Args, Loc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we put this into the block up there with the comment "Printf and scanf checking"? We care about the same case: calling a function with a FormatAttr
. Up there is a loop going over these attributes. Why don't we add this call to the loop, processing a single attribute at a time?
The reason is to avoid adding complexity in the common case of no such attribute. Up there we already have the check, so we're not adding anything.
And I don't think we lose anything. If anything, this should make the check simpler.
- Callees with multiple attributes are probably rare, but in that case we could still forward multiple format strings as @apple-fcloutier pointed out.
- If we treat these attributes independently and suggest contradictory attributes, this is probably an issue anyway.
Although in theory I don't see a reason why this is contradictory at all, at least without argument checking: it would simply constrain the format string to strings accepted by multiple formatting languages.
Or, as @apple-fcloutier suggested, just put it into CheckFormatArguments
.
// Check if va_list is passed to callee function. | ||
// If va_list is not passed, return. | ||
bool hasVaList = false; | ||
for (const auto *Param : Callee->parameters()) { | ||
if (Param->getOriginalType().getCanonicalType() == | ||
getASTContext().getBuiltinVaListType().getCanonicalType()) { | ||
hasVaList = true; | ||
break; | ||
} | ||
} | ||
if (!hasVaList) | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The attribute is also about checking the format string. So I don't think we should bail out here either. Just forwarding a parameter as format string should be enough to trigger the warning. Only for the fix-it we have to figure out what happens to the arguments.
IdentifierInfo *AttrType; | ||
const ParmVarDecl *FormatStringArg; | ||
if (!llvm::any_of( | ||
Callee->specific_attrs<FormatAttr>(), [&](const FormatAttr *Attr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest to call this from the loop in checkCall
that goes through FDecl->specific_attrs<FormatAttr>()
. Then we'd be looking at a single attribute here.
if (OffsetFormatIndex < 0 || (unsigned)OffsetFormatIndex >= NumArgs) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this actually happen? I would have thought we get an error when the index is out of bounds and don't even get here.
// Do not add duplicate in a vector of missing format attributes. | ||
if (!llvm::any_of(MissingAttributes, [&](const FormatAttr *Attr) { | ||
return Attr->getType() == AttrType && | ||
Attr->getFormatIdx() == FormatStringIndex && | ||
Attr->getFirstArg() == FirstToCheck; | ||
})) | ||
MissingAttributes.push_back(FormatAttr::CreateImplicit( | ||
getASTContext(), AttrType, FormatStringIndex, FirstToCheck, Loc)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just not bother doing this and emit the warning right here.
void f2(char *out, const size_t len, const char *format, ... /* args */) // #f2 | ||
{ | ||
va_list args; | ||
vsnprintf(out, len, format, args); // expected-warning@#f2 {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f2'}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should warn on the call where we found out. Yes, the declaration is where the fix should be, but this is typically indicated with a note. The warning should be where we see the issue. The issue can not be seen in the declaration, but in the call forwarding the parameter.
Another reason is that the attribute should be added to the declaration in a header file, not to the definition in the source file.
The reason why for example -Wmissing-noreturn
behaves different is that there is no place in the function body where we notice that the function never returns. It's a control flow graph walk that finds no paths to a return. On the other hand:
bool b;
[[noreturn]] void f() {
if (b) return;
}
produces
<stdin>:3:10: warning: function 'f' declared 'noreturn' should not return [-Winvalid-noreturn]
3 | if (b) return;
| ^
<stdin>:4:1: warning: function declared 'noreturn' should not return [-Winvalid-noreturn]
4 | }
| ^
GCC emits the warning on the forwarded parameter. In the printf
test case:
vprintf(out, args);
^
vscanf(out, args); | ||
vprintf(out, args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I don't see why we shouldn't warn on each of these calls. If the first call triggers a warning, why does the second call make it go away? This might be seen as leading to contradictory suggestions, but this function doesn't really make sense: the same arguments can't be both arguments to printf
and scanf
, unless the list is empty.
|
||
void f16(char *out, va_list args) // #f16 | ||
{ | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The blocks seem to be leftovers from an earlier version where you walked the function body and unpacked CompoundStmt
s. Now this doesn't make so much sense anymore, so I'd drop them and then drop duplicate test cases.
clang/lib/Sema/SemaDeclAttr.cpp
Outdated
// va_list is not intended to be passed to variadic function. | ||
if (CalleeFunction->isVariadic()) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your second example is what I meant. The attribute can have first-to-check
= 0, in which case only the format string is checked. Such an attribute can be on a non-variadic function. Since we're forwarding the parameter, we should still suggest adding the attribute so that the format string can be checked when f1X
is called.
Enable flag -Wmissing-format-attribute to catch missing attributes.
Fixes #60718