Skip to content

Commit f96aa63

Browse files
Address RFC, review feedback
1 parent a9780a6 commit f96aa63

File tree

4 files changed

+57
-137
lines changed

4 files changed

+57
-137
lines changed

clang/include/clang/Basic/AttrDocs.td

Lines changed: 40 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -3760,20 +3760,23 @@ other. For instance:
37603760
static int wind_speed;
37613761

37623762
void say_hi(const char *fmt) {
3763-
printf(fmt, first_name, todays_temperature); // warning: format string is not a string literal
3764-
printf(fmt, first_name, wind_speed); // warning: format string is not a string literal
3763+
printf(fmt, first_name, todays_temperature);
3764+
// ^ warning: format string is not a string literal
3765+
printf(fmt, first_name, wind_speed);
3766+
// ^ warning: format string is not a string literal
37653767
}
37663768

37673769
int main() {
37683770
say_hi("hello %s, it is %g degrees outside");
3769-
say_hi("hello %s, it is %d degrees outside!"); // no diagnostic
3771+
say_hi("hello %s, it is %d degrees outside!");
3772+
// ^ no diagnostic, but %d cannot format doubles
37703773
}
37713774

37723775
In this example, ``fmt`` is expected to format a ``const char *`` and a
37733776
``double``, but these values are not passed to ``say_hi``. Without the
37743777
``format`` attribute (which cannot apply in this case), the -Wformat-nonliteral
3775-
diagnostic triggers in the body of ``say_hi``, and incorrect ``say_hi`` call
3776-
sites do not trigger a diagnostic.
3778+
diagnostic unnecessarily triggers in the body of ``say_hi``, and incorrect
3779+
``say_hi`` call sites do not trigger a diagnostic.
37773780

37783781
To complement the ``format`` attribute, Clang also defines the
37793782
``format_matches`` attribute. Its syntax is similar to the ``format``
@@ -3803,10 +3806,31 @@ The third argument to ``format_matches`` is expected to evaluate to a **C string
38033806
literal** even when the format string would normally be a different type for the
38043807
given flavor, like a ``CFStringRef`` or a ``NSString *``.
38053808

3809+
The only requirement on the format string literal is that it has specifiers
3810+
that are compatible with the arguments that will be used. It can contain
3811+
arbitrary non-format characters. For instance, for the purposes of compile-time
3812+
validation, ``"%s scored %g%% on her test"`` and ``"%s%g"`` are interchangeable
3813+
as the format string argument. As a means of self-documentation, users may
3814+
prefer the former when it provides a useful example of an expected format
3815+
string.
3816+
38063817
In the implementation of a function with the ``format_matches`` attribute,
38073818
format verification works as if the format string was identical to the one
38083819
specified in the attribute.
38093820

3821+
.. code-block:: c
3822+
3823+
__attribute__((__format_matches__(printf, 1, "%s %g")))
3824+
void say_hi(const char *fmt) {
3825+
printf(fmt, "person", 546);
3826+
// ^ warning: format specifies type 'double' but the
3827+
// argument has type 'int'
3828+
// note: format string is defined here:
3829+
// __attribute__((__format_matches__(printf, 1, "%s %g")))
3830+
// ^~
3831+
}
3832+
3833+
38103834
At the call sites of functions with the ``format_matches`` attribute, format
38113835
verification instead compares the two format strings to evaluate their
38123836
equivalence. Each format flavor defines equivalence between format specifiers.
@@ -3817,15 +3841,14 @@ a negative example, ``%ld`` is incompatible with ``%d``.
38173841

38183842
Do note the following un-obvious cases:
38193843

3820-
* Passing ``NULL`` as the format string is accepted without diagnostics.
3844+
* Passing ``NULL`` as the format string does not trigger format diagnostics.
38213845
* When the format string is not NULL, it cannot _miss_ specifiers, even in
38223846
trailing positions. For instance, ``%d`` is not accepted when the required
38233847
format is ``%d %d %d``.
3824-
* Specifiers for integers as small or smaller than ``int`` (such as ``%hhd``)
3825-
are all mutually compatible because standard argument promotion ensures that
3826-
integers are at least the size of ``int`` when passed as variadic arguments.
3827-
With ``-Wformat-signedness``, mixing specifier for types with a different
3828-
signedness still results in a diagnostic.
3848+
* While checks for the ``format`` attribute tolerate sone size mismatches
3849+
that standard argument promotion renders immaterial (such as formatting an
3850+
``int`` with ``%hhd``, which specifies a ``char``-sized integer), checks for
3851+
``format_matches`` require specified argument sizes to match exactly.
38293852
* Format strings expecting a variable modifier (such as ``%*s``) are
38303853
incompatible with format strings that would itemize the variable modifiers
38313854
(such as ``%i %s``), even if the two specify ABI-compatible argument lists.
@@ -3837,6 +3860,12 @@ Do note the following un-obvious cases:
38373860
This is not overridable with ``-Wformat-pedantic`` or its inverse, which
38383861
control similar behavior in ``-Wformat``.
38393862

3863+
At this time, clang implements ``format_matches`` only for format types in the
3864+
``printf`` family. This includes variants such as Apple's NSString format and
3865+
the FreeBSD ``kprintf``, but excludes ``scanf``. Using a known but unsupported
3866+
format silently fails in order to be compatible with other implementations that
3867+
would support these formats.
3868+
38403869
}];
38413870
}
38423871

clang/lib/AST/AttrImpl.cpp

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -271,7 +271,6 @@ unsigned AlignedAttr::getAlignment(ASTContext &Ctx) const {
271271
}
272272

273273
StringLiteral *FormatMatchesAttr::getFormatString() const {
274-
// This cannot go in headers because StringLiteral and Expr may be incomplete.
275274
return cast<StringLiteral>(getExpectedFormat());
276275
}
277276

clang/lib/Sema/SemaChecking.cpp

Lines changed: 9 additions & 97 deletions
Original file line numberDiff line numberDiff line change
@@ -7234,9 +7234,11 @@ void EquatableFormatArgument::VerifyCompatible(
72347234

72357235
switch (ArgType.matchesArgType(S.Context, Other.ArgType)) {
72367236
case MK::Match:
7237-
case MK::MatchPromotion:
72387237
break;
72397238

7239+
case MK::MatchPromotion:
7240+
// Per consensus reached at https://discourse.llvm.org/t/-/83076/12,
7241+
// MatchPromotion is treated as a failure by format_matches.
72407242
case MK::NoMatch:
72417243
case MK::NoMatchTypeConfusion:
72427244
case MK::NoMatchPromotionTypeConfusion:
@@ -8225,36 +8227,6 @@ class CheckScanfHandler : public CheckFormatHandler {
82258227
void HandleIncompleteScanList(const char *start, const char *end) override;
82268228
};
82278229

8228-
class DecomposeScanfHandler : public CheckScanfHandler {
8229-
llvm::SmallVectorImpl<EquatableFormatArgument> &Specs;
8230-
bool HadError;
8231-
8232-
DecomposeScanfHandler(Sema &s, const FormatStringLiteral *fexpr,
8233-
const Expr *origFormatExpr, Sema::FormatStringType type,
8234-
unsigned firstDataArg, unsigned numDataArgs,
8235-
const char *beg, Sema::FormatArgumentPassingKind APK,
8236-
ArrayRef<const Expr *> Args, unsigned formatIdx,
8237-
bool inFunctionCall, Sema::VariadicCallType CallType,
8238-
llvm::SmallBitVector &CheckedVarArgs,
8239-
UncoveredArgHandler &UncoveredArg,
8240-
llvm::SmallVectorImpl<EquatableFormatArgument> &Specs)
8241-
: CheckScanfHandler(s, fexpr, origFormatExpr, type, firstDataArg,
8242-
numDataArgs, beg, APK, Args, formatIdx,
8243-
inFunctionCall, CallType, CheckedVarArgs,
8244-
UncoveredArg),
8245-
Specs(Specs), HadError(false) {}
8246-
8247-
public:
8248-
static bool
8249-
GetSpecifiers(Sema &S, const FormatStringLiteral *FSL,
8250-
Sema::FormatStringType type, bool InFunctionCall,
8251-
llvm::SmallVectorImpl<EquatableFormatArgument> &Args);
8252-
8253-
bool HandleScanfSpecifier(const analyze_scanf::ScanfSpecifier &FS,
8254-
const char *startSpecifier,
8255-
unsigned specifierLen) override;
8256-
};
8257-
82588230
} // namespace
82598231

82608232
void CheckScanfHandler::HandleIncompleteScanList(const char *start,
@@ -8401,53 +8373,6 @@ bool CheckScanfHandler::HandleScanfSpecifier(
84018373
return true;
84028374
}
84038375

8404-
bool DecomposeScanfHandler::GetSpecifiers(
8405-
Sema &S, const FormatStringLiteral *FSL, Sema::FormatStringType Type,
8406-
bool InFunctionCall, llvm::SmallVectorImpl<EquatableFormatArgument> &Args) {
8407-
StringRef Data = FSL->getString();
8408-
const char *Str = Data.data();
8409-
llvm::SmallBitVector BV;
8410-
UncoveredArgHandler UA;
8411-
DecomposeScanfHandler H(S, FSL, FSL->getFormatString(), Type, 0, 0, Str,
8412-
Sema::FAPK_Elsewhere, {FSL->getFormatString()}, 0,
8413-
InFunctionCall, Sema::VariadicDoesNotApply, BV, UA,
8414-
Args);
8415-
8416-
if (!analyze_format_string::ParseScanfString(H, Str, Str + Data.size(),
8417-
S.getLangOpts(),
8418-
S.Context.getTargetInfo()))
8419-
H.DoneProcessing();
8420-
if (H.HadError)
8421-
return false;
8422-
8423-
std::sort(
8424-
Args.begin(), Args.end(),
8425-
[](const EquatableFormatArgument &A, const EquatableFormatArgument &B) {
8426-
return A.getPosition() < B.getPosition();
8427-
});
8428-
return true;
8429-
}
8430-
8431-
bool DecomposeScanfHandler::HandleScanfSpecifier(
8432-
const analyze_scanf::ScanfSpecifier &FS, const char *startSpecifier,
8433-
unsigned specifierLen) {
8434-
if (!CheckScanfHandler::HandleScanfSpecifier(FS, startSpecifier,
8435-
specifierLen)) {
8436-
HadError = true;
8437-
return false;
8438-
}
8439-
8440-
const auto &CS = FS.getConversionSpecifier();
8441-
Specs.emplace_back(
8442-
getSpecifierRange(startSpecifier, specifierLen),
8443-
getLocationOfByte(CS.getStart()), FS.getLengthModifier().getKind(),
8444-
CS.getCharacters(), FS.getArgType(S.Context),
8445-
EquatableFormatArgument::FAR_Data, EquatableFormatArgument::SS_None,
8446-
FS.usesPositionalArg() ? FS.getPositionalArgIndex() - 1 : Specs.size(),
8447-
0);
8448-
return true;
8449-
}
8450-
84518376
static void CompareFormatSpecifiers(Sema &S, const StringLiteral *Ref,
84528377
ArrayRef<EquatableFormatArgument> RefArgs,
84538378
const StringLiteral *Fmt,
@@ -8556,26 +8481,13 @@ static void CheckFormatString(
85568481
}
85578482
}
85588483
} else if (Type == Sema::FST_Scanf) {
8559-
if (ReferenceFormatString == nullptr) {
8560-
CheckScanfHandler H(S, FExpr, OrigFormatExpr, Type, firstDataArg,
8561-
numDataArgs, Str, APK, Args, format_idx,
8562-
inFunctionCall, CallType, CheckedVarArgs,
8563-
UncoveredArg);
8484+
CheckScanfHandler H(S, FExpr, OrigFormatExpr, Type, firstDataArg,
8485+
numDataArgs, Str, APK, Args, format_idx, inFunctionCall,
8486+
CallType, CheckedVarArgs, UncoveredArg);
85648487

8565-
if (!analyze_format_string::ParseScanfString(
8566-
H, Str, Str + StrLen, S.getLangOpts(), S.Context.getTargetInfo()))
8567-
H.DoneProcessing();
8568-
} else {
8569-
llvm::SmallVector<EquatableFormatArgument, 9> RefArgs, FmtArgs;
8570-
FormatStringLiteral RefLit = ReferenceFormatString;
8571-
if (DecomposeScanfHandler::GetSpecifiers(S, &RefLit, Type, inFunctionCall,
8572-
RefArgs) &&
8573-
DecomposeScanfHandler::GetSpecifiers(S, FExpr, Type, inFunctionCall,
8574-
FmtArgs)) {
8575-
CompareFormatSpecifiers(S, ReferenceFormatString, RefArgs,
8576-
FExpr->getFormatString(), FmtArgs);
8577-
}
8578-
}
8488+
if (!analyze_format_string::ParseScanfString(
8489+
H, Str, Str + StrLen, S.getLangOpts(), S.Context.getTargetInfo()))
8490+
H.DoneProcessing();
85798491
} // TODO: handle other formats
85808492
}
85818493

clang/test/Sema/format-string-matches.c

Lines changed: 8 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -9,9 +9,6 @@ int printf(const char *fmt, ...);
99
__attribute__((format(printf, 1, 0)))
1010
int vprintf(const char *fmt, va_list);
1111

12-
__attribute__((format(scanf, 1, 2)))
13-
int scanf(const char *fmt, ...);
14-
1512
// MARK: -
1613
// Calling printf with a format from format_matches(printf) diagnoses with
1714
// that format string
@@ -41,30 +38,6 @@ void vformat(const char *fmt, ...) {
4138
va_end(ap);
4239
}
4340

44-
// MARK: -
45-
// Calling scanf
46-
__attribute__((format_matches(scanf, 1, "%hhi %g"))) // expected-note 3{{comparing with this specifier}}
47-
void scan(const char *fmt) {
48-
char i;
49-
float g;
50-
scanf(fmt, &i, &g);
51-
}
52-
53-
__attribute__((format_matches(scanf, 1, "%hhi %g"))) // expected-note{{format string is defined here}}
54-
void scan2(const char *fmt) {
55-
char i;
56-
double g;
57-
scanf(fmt, &i, &g); // expected-warning{{format specifies type 'float *' but the argument has type 'double *'}}
58-
}
59-
60-
void call_scan(void) {
61-
scan("%hhd %e");
62-
scan("%hd %Le"); // \
63-
expected-warning{{format specifier 'hd' is incompatible with 'hhi'}} \
64-
expected-warning{{format specifier 'Le' is incompatible with 'g'}}
65-
scan("%s %p"); // expected-warning{{format specifier 'p' is incompatible with 'g'}}
66-
}
67-
6841
// MARK: -
6942
// Calling a function with format_matches diagnoses for incompatible formats.
7043

@@ -75,7 +48,8 @@ void cvt_at(const char *c) __attribute__((format_matches(NSString, 1, "%@"))); /
7548
expected-note{{comparing with this format string}}
7649
void cvt_c(const char *c) __attribute__((format_matches(printf, 1, "%c"))); // expected-note{{comparing with this specifier}}
7750
void cvt_u(const char *c) __attribute__((format_matches(printf, 1, "%u"))); // expected-note{{comparing with this specifier}}
78-
void cvt_i(const char *c) __attribute__((format_matches(printf, 1, "%i"))); // expected-note 2{{comparing with this specifier}}
51+
void cvt_hhi(const char *c) __attribute__((format_matches(printf, 1, "%hhi"))); // expected-note 3{{comparing with this specifier}}
52+
void cvt_i(const char *c) __attribute__((format_matches(printf, 1, "%i"))); // expected-note 4{{comparing with this specifier}}
7953
void cvt_p(const char *c) __attribute__((format_matches(printf, 1, "%p")));
8054
void cvt_s(const char *c) __attribute__((format_matches(printf, 1, "%s"))); // expected-note{{comparing with this specifier}}
8155
void cvt_n(const char *c) __attribute__((format_matches(printf, 1, "%n"))); // expected-note{{comparing with this specifier}}
@@ -86,8 +60,14 @@ void test_compatibility(void) {
8660
cvt_c("%u"); // expected-warning{{signedness of format specifier 'u' is incompatible with 'c'}}
8761
cvt_u("%c"); // expected-warning{{signedness of format specifier 'c' is incompatible with 'u'}}
8862

63+
cvt_i("%hi"); // expected-warning{{format specifier 'hi' is incompatible with 'i'}}
64+
cvt_i("%hhi"); // expected-warning{{format specifier 'hhi' is incompatible with 'i'}}
8965
cvt_i("%lli"); // expected-warning{{format specifier 'lli' is incompatible with 'i'}}
9066
cvt_i("%p"); // expected-warning{{format specifier 'p' is incompatible with 'i'}}
67+
cvt_hhi("%hhi");
68+
cvt_hhi("%hi"); // expected-warning{{format specifier 'hi' is incompatible with 'hhi'}}
69+
cvt_hhi("%i"); // expected-warning{{format specifier 'i' is incompatible with 'hhi'}}
70+
cvt_hhi("%li"); // expected-warning{{format specifier 'li' is incompatible with 'hhi'}}
9171
cvt_n("%s"); // expected-warning{{format specifier 's' is incompatible with 'n'}}
9272
cvt_s("%hhn"); // expected-warning{{format specifier 'hhn' is incompatible with 's'}}
9373

0 commit comments

Comments
 (0)