Commit 998709b

Author: Thomas Preud'homme
[FileCheck] Add precision to format specifier
Add printf-style precision specifier to pad numbers to a given number of digits when matching them if the value has fewer digits than the given precision. This works both for an empty numeric expression (e.g. a variable definition from input) and when matching a numeric expression. The syntax is as follows:

[[#%.<precision><format specifier>, ...]]

where <format specifier> is optional and ... can be a variable definition or not with an empty expression or not. In the absence of a precision specifier, a variable definition will accept leading zeros.

Reviewed By: jhenderson, grimar

Differential Revision: https://reviews.llvm.org/D81667
1 parent 719548d commit 998709b


5 files changed: +329 -116 lines changed


llvm/docs/CommandGuide/FileCheck.rst

Lines changed: 46 additions & 30 deletions
@@ -730,35 +730,60 @@ numeric expression constraint based on those variables via a numeric
 substitution. This allows ``CHECK:`` directives to verify a numeric relation
 between two numbers, such as the need for consecutive registers to be used.

-The syntax to define a numeric variable is ``[[#%<fmtspec>,<NUMVAR>:]]`` where:
+The syntax to capture a numeric value is
+``[[#%<fmtspec>,<NUMVAR>:]]`` where:

-* ``%<fmtspec>`` is an optional scanf-style matching format specifier to
-  indicate what number format to match (e.g. hex number). Currently accepted
-  format specifiers are ``%u``, ``%d``, ``%x`` and ``%X``. If absent, the
-  format specifier defaults to ``%u``.
+* ``%<fmtspec>,`` is an optional format specifier to indicate what number
+  format to match and the minimum number of digits to expect.
+
+* ``<NUMVAR>:`` is an optional definition of variable ``<NUMVAR>`` from the
+  captured value.
+
+The syntax of ``<fmtspec>`` is: ``.<precision><conversion specifier>`` where:
+
+* ``.<precision>`` is an optional printf-style precision specifier in which
+  ``<precision>`` indicates the minimum number of digits that the value matched
+  must have, expecting leading zeros if needed.
+
+* ``<conversion specifier>`` is an optional scanf-style conversion specifier
+  to indicate what number format to match (e.g. hex number). Currently
+  accepted format specifiers are ``%u``, ``%d``, ``%x`` and ``%X``. If absent,
+  the format specifier defaults to ``%u``.

-* ``<NUMVAR>`` is the name of the numeric variable to define to the matching
-  value.

 For example:

 .. code-block:: llvm

-    ; CHECK: mov r[[#REG:]], 0x[[#%X,IMM:]]
+    ; CHECK: mov r[[#REG:]], 0x[[#%.8X,ADDR:]]

-would match ``mov r5, 0xF0F0`` and set ``REG`` to the value ``5`` and ``IMM``
-to the value ``0xF0F0``.
+would match ``mov r5, 0x0000FEFE`` and set ``REG`` to the value ``5`` and
+``ADDR`` to the value ``0xFEFE``. Note that due to the precision it would fail
+to match ``mov r5, 0xFEFE``.

-The syntax of a numeric substitution is
-``[[#%<fmtspec>: <constraint> <expr>]]`` where:
+As a result of the numeric variable definition being optional, it is possible
+to only check that a numeric value is present in a given format. This can be
+useful when the value itself is not useful, for instance:

-* ``%<fmtspec>`` is the same matching format specifier as for defining numeric
-  variables but acting as a printf-style format to indicate how a numeric
-  expression value should be matched against. If absent, the format specifier
-  is inferred from the matching format of the numeric variable(s) used by the
-  expression constraint if any, and defaults to ``%u`` if no numeric variable
-  is used. In case of conflict between matching formats of several numeric
-  variables the format specifier is mandatory.
+.. code-block:: gas
+
+    ; CHECK-NOT: mov r0, r[[#]]
+
+to check that a value is synthesized rather than moved around.
+
+
+The syntax of a numeric substitution is
+``[[#%<fmtspec>, <constraint> <expr>]]`` where:
+
+* ``<fmtspec>`` is the same format specifier as for defining a variable but
+  in this context indicating how a numeric expression value should be matched
+  against. If absent, both components of the format specifier are inferred from
+  the matching format of the numeric variable(s) used by the expression
+  constraint if any, and defaults to ``%u`` if no numeric variable is used,
+  denoting that the value should be unsigned with no leading zeros. In case of
+  conflict between format specifiers of several numeric variables, the
+  conversion specifier becomes mandatory but the precision specifier remains
+  optional.

 * ``<constraint>`` is the constraint describing how the value to match must
   relate to the value of the numeric expression. The only currently accepted
@@ -824,20 +849,11 @@ but would not match the text:
 Due to ``7`` being unequal to ``5 + 1`` and ``a0463443`` being unequal to
 ``a0463440 + 7``.

-The syntax also supports an empty expression, equivalent to writing {{[0-9]+}},
-for cases where the input must contain a numeric value but the value itself
-does not matter:
-
-.. code-block:: gas
-
-    ; CHECK-NOT: mov r0, r[[#]]
-
-to check that a value is synthesized rather than moved around.

 A numeric variable can also be defined to the result of a numeric expression,
 in which case the numeric expression constraint is checked and if verified the
-variable is assigned to the value. The unified syntax for both defining numeric
-variables and checking a numeric expression is thus
+variable is assigned to the value. The unified syntax for both checking a
+numeric expression and capturing its value into a numeric variable is thus
 ``[[#%<fmtspec>,<NUMVAR>: <constraint> <expr>]]`` with each element as
 described previously. One can use this syntax to make a testcase more
 self-describing by using variables instead of values:

llvm/lib/Support/FileCheck.cpp

Lines changed: 91 additions & 41 deletions
@@ -43,16 +43,28 @@ StringRef ExpressionFormat::toString() const {
   llvm_unreachable("unknown expression format");
 }

-Expected<StringRef> ExpressionFormat::getWildcardRegex() const {
+Expected<std::string> ExpressionFormat::getWildcardRegex() const {
+  auto CreatePrecisionRegex = [this](StringRef S) {
+    return (S + Twine('{') + Twine(Precision) + "}").str();
+  };
+
   switch (Value) {
   case Kind::Unsigned:
-    return StringRef("[0-9]+");
+    if (Precision)
+      return CreatePrecisionRegex("([1-9][0-9]*)?[0-9]");
+    return std::string("[0-9]+");
   case Kind::Signed:
-    return StringRef("-?[0-9]+");
+    if (Precision)
+      return CreatePrecisionRegex("-?([1-9][0-9]*)?[0-9]");
+    return std::string("-?[0-9]+");
   case Kind::HexUpper:
-    return StringRef("[0-9A-F]+");
+    if (Precision)
+      return CreatePrecisionRegex("([1-9A-F][0-9A-F]*)?[0-9A-F]");
+    return std::string("[0-9A-F]+");
   case Kind::HexLower:
-    return StringRef("[0-9a-f]+");
+    if (Precision)
+      return CreatePrecisionRegex("([1-9a-f][0-9a-f]*)?[0-9a-f]");
+    return std::string("[0-9a-f]+");
   default:
     return createStringError(std::errc::invalid_argument,
                              "trying to match value with invalid format");
@@ -61,27 +73,47 @@ Expected<StringRef> ExpressionFormat::getWildcardRegex() const {

 Expected<std::string>
 ExpressionFormat::getMatchingString(ExpressionValue IntegerValue) const {
+  uint64_t AbsoluteValue;
+  StringRef SignPrefix = IntegerValue.isNegative() ? "-" : "";
+
   if (Value == Kind::Signed) {
     Expected<int64_t> SignedValue = IntegerValue.getSignedValue();
     if (!SignedValue)
       return SignedValue.takeError();
-    return itostr(*SignedValue);
+    if (*SignedValue < 0)
+      AbsoluteValue = cantFail(IntegerValue.getAbsolute().getUnsignedValue());
+    else
+      AbsoluteValue = *SignedValue;
+  } else {
+    Expected<uint64_t> UnsignedValue = IntegerValue.getUnsignedValue();
+    if (!UnsignedValue)
+      return UnsignedValue.takeError();
+    AbsoluteValue = *UnsignedValue;
   }

-  Expected<uint64_t> UnsignedValue = IntegerValue.getUnsignedValue();
-  if (!UnsignedValue)
-    return UnsignedValue.takeError();
+  std::string AbsoluteValueStr;
   switch (Value) {
   case Kind::Unsigned:
-    return utostr(*UnsignedValue);
+  case Kind::Signed:
+    AbsoluteValueStr = utostr(AbsoluteValue);
+    break;
   case Kind::HexUpper:
-    return utohexstr(*UnsignedValue, /*LowerCase=*/false);
   case Kind::HexLower:
-    return utohexstr(*UnsignedValue, /*LowerCase=*/true);
+    AbsoluteValueStr = utohexstr(AbsoluteValue, Value == Kind::HexLower);
+    break;
   default:
     return createStringError(std::errc::invalid_argument,
                              "trying to match value with invalid format");
   }
+
+  if (Precision > AbsoluteValueStr.size()) {
+    unsigned LeadingZeros = Precision - AbsoluteValueStr.size();
+    return (Twine(SignPrefix) + std::string(LeadingZeros, '0') +
+            AbsoluteValueStr)
+        .str();
+  }
+
+  return (Twine(SignPrefix) + AbsoluteValueStr).str();
 }

 Expected<ExpressionValue>
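
Note on the hunk above: the new tail of `getMatchingString` pads the magnitude with leading zeros and puts the sign in front of the padding, so the precision counts digits only. The following is a minimal standalone sketch of that padding rule for signed decimal values. It uses only the standard library, and `formatWithPrecision` and `checkFormatWithPrecision` are hypothetical names that do not exist in LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper, not LLVM code: renders Value in decimal with at least
// Precision digits. The sign is emitted before the zero padding.
std::string formatWithPrecision(int64_t Value, unsigned Precision) {
  // Compute the magnitude in unsigned arithmetic so INT64_MIN does not overflow.
  uint64_t Magnitude = Value < 0 ? 0 - static_cast<uint64_t>(Value)
                                 : static_cast<uint64_t>(Value);
  std::string Digits = std::to_string(Magnitude);
  std::string Sign = Value < 0 ? "-" : "";
  if (Precision > Digits.size())
    Digits = std::string(Precision - Digits.size(), '0') + Digits;
  return Sign + Digits;
}

// Usage sketch: values shorter than the precision are padded, longer ones are
// left untouched.
inline void checkFormatWithPrecision() {
  assert(formatWithPrecision(5, 3) == "005");
  assert(formatWithPrecision(-5, 3) == "-005");
  assert(formatWithPrecision(12345, 3) == "12345");
}
```

The hexadecimal cases in the commit follow the same rule, with `utohexstr` producing the digits instead of a decimal conversion.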
@@ -720,41 +752,59 @@ Expected<std::unique_ptr<Expression>> Pattern::parseNumericSubstitutionBlock(
   StringRef DefExpr = StringRef();
   DefinedNumericVariable = None;
   ExpressionFormat ExplicitFormat = ExpressionFormat();
+  unsigned Precision = 0;

   // Parse format specifier (NOTE: ',' is also an argument seperator).
   size_t FormatSpecEnd = Expr.find(',');
   size_t FunctionStart = Expr.find('(');
   if (FormatSpecEnd != StringRef::npos && FormatSpecEnd < FunctionStart) {
-    Expr = Expr.ltrim(SpaceChars);
-    if (!Expr.consume_front("%"))
+    StringRef FormatExpr = Expr.take_front(FormatSpecEnd);
+    Expr = Expr.drop_front(FormatSpecEnd + 1);
+    FormatExpr = FormatExpr.trim(SpaceChars);
+    if (!FormatExpr.consume_front("%"))
       return ErrorDiagnostic::get(
-          SM, Expr, "invalid matching format specification in expression");
-
-    // Check for unknown matching format specifier and set matching format in
-    // class instance representing this expression.
-    SMLoc fmtloc = SMLoc::getFromPointer(Expr.data());
-    switch (popFront(Expr)) {
-    case 'u':
-      ExplicitFormat = ExpressionFormat(ExpressionFormat::Kind::Unsigned);
-      break;
-    case 'd':
-      ExplicitFormat = ExpressionFormat(ExpressionFormat::Kind::Signed);
-      break;
-    case 'x':
-      ExplicitFormat = ExpressionFormat(ExpressionFormat::Kind::HexLower);
-      break;
-    case 'X':
-      ExplicitFormat = ExpressionFormat(ExpressionFormat::Kind::HexUpper);
-      break;
-    default:
-      return ErrorDiagnostic::get(SM, fmtloc,
-                                  "invalid format specifier in expression");
+          SM, FormatExpr,
+          "invalid matching format specification in expression");
+
+    // Parse precision.
+    if (FormatExpr.consume_front(".")) {
+      if (FormatExpr.consumeInteger(10, Precision))
+        return ErrorDiagnostic::get(SM, FormatExpr,
+                                    "invalid precision in format specifier");
     }

-    Expr = Expr.ltrim(SpaceChars);
-    if (!Expr.consume_front(","))
+    if (!FormatExpr.empty()) {
+      // Check for unknown matching format specifier and set matching format in
+      // class instance representing this expression.
+      SMLoc FmtLoc = SMLoc::getFromPointer(FormatExpr.data());
+      switch (popFront(FormatExpr)) {
+      case 'u':
+        ExplicitFormat =
+            ExpressionFormat(ExpressionFormat::Kind::Unsigned, Precision);
+        break;
+      case 'd':
+        ExplicitFormat =
+            ExpressionFormat(ExpressionFormat::Kind::Signed, Precision);
+        break;
+      case 'x':
+        ExplicitFormat =
+            ExpressionFormat(ExpressionFormat::Kind::HexLower, Precision);
+        break;
+      case 'X':
+        ExplicitFormat =
+            ExpressionFormat(ExpressionFormat::Kind::HexUpper, Precision);
+        break;
+      default:
+        return ErrorDiagnostic::get(SM, FmtLoc,
+                                    "invalid format specifier in expression");
+      }
+    }
+
+    FormatExpr = FormatExpr.ltrim(SpaceChars);
+    if (!FormatExpr.empty())
       return ErrorDiagnostic::get(
-          SM, Expr, "invalid matching format specification in expression");
+          SM, FormatExpr,
+          "invalid matching format specification in expression");
   }

   // Save variable definition expression if any.
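
Note on the hunk above: the rewritten parser first isolates the text before the comma, then consumes `%`, an optional `.<precision>`, an optional conversion character, and rejects anything left over. The sketch below walks through the same steps on a `std::string`. It is an illustration under assumptions, not the LLVM code: `FormatSpec`, `parseFormatSpec` and `checkParseFormatSpec` are hypothetical names, errors are reduced to a `bool`, and overflow of very large precisions is not checked.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch only.
struct FormatSpec {
  unsigned Precision = 0; // 0 means no precision was given
  char Conversion = '\0'; // '\0' means no conversion specifier was given
};

// Parses text such as "%.8X", "%d" or "%.3". Returns false on malformed input
// and leaves Out untouched in that case.
bool parseFormatSpec(const std::string &Text, FormatSpec &Out) {
  std::size_t Pos = 0, End = Text.size();
  // Trim spaces and tabs on both sides.
  while (Pos < End && (Text[Pos] == ' ' || Text[Pos] == '\t'))
    ++Pos;
  while (End > Pos && (Text[End - 1] == ' ' || Text[End - 1] == '\t'))
    --End;
  if (Pos == End || Text[Pos] != '%')
    return false;
  ++Pos;

  FormatSpec Spec;
  // Optional precision: a '.' must be followed by at least one digit.
  if (Pos < End && Text[Pos] == '.') {
    ++Pos;
    std::size_t DigitsStart = Pos;
    while (Pos < End && std::isdigit(static_cast<unsigned char>(Text[Pos]))) {
      Spec.Precision = Spec.Precision * 10 + (Text[Pos] - '0');
      ++Pos;
    }
    if (Pos == DigitsStart)
      return false;
  }
  // Optional conversion specifier.
  if (Pos < End) {
    char C = Text[Pos++];
    if (C != 'u' && C != 'd' && C != 'x' && C != 'X')
      return false;
    Spec.Conversion = C;
  }
  // Anything left over is an error.
  if (Pos != End)
    return false;
  Out = Spec;
  return true;
}

// Usage sketch.
inline void checkParseFormatSpec() {
  FormatSpec Spec;
  assert(parseFormatSpec("%.8X", Spec));
  assert(Spec.Precision == 8 && Spec.Conversion == 'X');
  assert(!parseFormatSpec("%.X", Spec)); // '.' without digits
}
```

In the commit itself, a precision without a conversion specifier leaves `ExplicitFormat` unset, and the precision is then applied only when the format falls back to the default unsigned kind.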
@@ -814,7 +864,7 @@ Expected<std::unique_ptr<Expression>> Pattern::parseNumericSubstitutionBlock(
     Format = *ImplicitFormat;
   }
   if (!Format)
-    Format = ExpressionFormat(ExpressionFormat::Kind::Unsigned);
+    Format = ExpressionFormat(ExpressionFormat::Kind::Unsigned, Precision);

   std::unique_ptr<Expression> ExpressionPointer =
       std::make_unique<Expression>(std::move(ExpressionASTPointer), Format);
@@ -948,7 +998,7 @@ bool Pattern::parsePattern(StringRef PatternStr, StringRef Prefix,
       bool IsLegacyLineExpr = false;
       StringRef DefName;
       StringRef SubstStr;
-      StringRef MatchRegexp;
+      std::string MatchRegexp;
       size_t SubstInsertIdx = RegExStr.size();

       // Parse string variable or legacy @LINE expression.
@@ -992,7 +1042,7 @@ bool Pattern::parsePattern(StringRef PatternStr, StringRef Prefix,
             return true;
           }
           DefName = Name;
-          MatchRegexp = MatchStr;
+          MatchRegexp = MatchStr.str();
         } else {
           if (IsPseudo) {
             MatchStr = OrigMatchStr;

llvm/lib/Support/FileCheckImpl.h

Lines changed: 12 additions & 8 deletions
@@ -53,15 +53,17 @@ struct ExpressionFormat {

 private:
   Kind Value;
+  unsigned Precision = 0;

 public:
   /// Evaluates a format to true if it can be used in a match.
   explicit operator bool() const { return Value != Kind::NoFormat; }

   /// Define format equality: formats are equal if neither is NoFormat and
-  /// their kinds are the same.
+  /// their kinds and precision are the same.
   bool operator==(const ExpressionFormat &Other) const {
-    return Value != Kind::NoFormat && Value == Other.Value;
+    return Value != Kind::NoFormat && Value == Other.Value &&
+           Precision == Other.Precision;
   }

   bool operator!=(const ExpressionFormat &Other) const {
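
Note on the hunk above: with precision folded into equality, two formats of the same kind but different precisions now compare unequal, and a `NoFormat` value is still never equal to anything, itself included. A small standalone model of that rule is sketched below. `Format` and `checkFormatEquality` are illustrative stand-ins rather than the real `ExpressionFormat`, and the body of `operator!=` is an assumption because the diff does not show it.

```cpp
#include <cassert>

// Illustrative stand-in for ExpressionFormat; only equality is modelled.
enum class Kind { NoFormat, Unsigned, Signed, HexUpper, HexLower };

struct Format {
  Kind Value;
  unsigned Precision;

  Format() : Value(Kind::NoFormat), Precision(0) {}
  Format(Kind Value, unsigned Precision) : Value(Value), Precision(Precision) {}

  // Equal only if this is a usable format and both kind and precision agree.
  bool operator==(const Format &Other) const {
    return Value != Kind::NoFormat && Value == Other.Value &&
           Precision == Other.Precision;
  }
  // Assumed to be the plain negation of operator==.
  bool operator!=(const Format &Other) const { return !(*this == Other); }
};

// Usage sketch.
inline void checkFormatEquality() {
  assert(Format(Kind::HexUpper, 8) == Format(Kind::HexUpper, 8));
  assert(Format(Kind::HexUpper, 8) != Format(Kind::HexUpper, 0));
  assert(!(Format() == Format())); // NoFormat never compares equal
}
```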
@@ -76,12 +78,14 @@ struct ExpressionFormat {
   StringRef toString() const;

   ExpressionFormat() : Value(Kind::NoFormat){};
-  explicit ExpressionFormat(Kind Value) : Value(Value){};
-
-  /// \returns a wildcard regular expression StringRef that matches any value
-  /// in the format represented by this instance, or an error if the format is
-  /// NoFormat.
-  Expected<StringRef> getWildcardRegex() const;
+  explicit ExpressionFormat(Kind Value) : Value(Value), Precision(0){};
+  explicit ExpressionFormat(Kind Value, unsigned Precision)
+      : Value(Value), Precision(Precision){};
+
+  /// \returns a wildcard regular expression string that matches any value in
+  /// the format represented by this instance and no other value, or an error
+  /// if the format is NoFormat.
+  Expected<std::string> getWildcardRegex() const;

   /// \returns the string representation of \p Value in the format represented
   /// by this instance, or an error if conversion to this format failed or the
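
Note on `getWildcardRegex`: with a precision of N, the pattern built in FileCheck.cpp is an optional group that starts with a non-zero digit, followed by exactly N digits. That accepts values padded to N digits and longer values without leading zeros. The sketch below rebuilds the uppercase-hex pattern and tries it with `std::regex`. This is an assumption-laden illustration: FileCheck uses its own regex engine and embeds the pattern inside a larger one, whereas this sketch requires the whole string to match. `hexUpperWildcard`, `acceptsWhole` and `checkHexUpperWildcard` are hypothetical names.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: same pattern shape as the commit uses for uppercase
// hex values. A precision of 0 means no precision was requested.
std::string hexUpperWildcard(unsigned Precision) {
  if (Precision == 0)
    return "[0-9A-F]+";
  return "([1-9A-F][0-9A-F]*)?[0-9A-F]{" + std::to_string(Precision) + "}";
}

// Returns true when Text as a whole is accepted by the wildcard pattern.
bool acceptsWhole(const std::string &Text, unsigned Precision) {
  return std::regex_match(Text, std::regex(hexUpperWildcard(Precision)));
}

// Usage sketch based on the documentation example with precision 8.
inline void checkHexUpperWildcard() {
  assert(acceptsWhole("0000FEFE", 8)); // padded to 8 digits
  assert(!acceptsWhole("FEFE", 8));    // too few digits
  assert(acceptsWhole("1FFFFFFFF", 8)); // longer value, no leading zero
}
```

Without a precision the plain `[0-9A-F]+` pattern is used, which is why a variable definition with no precision accepts leading zeros.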
