Skip to content

[Clang] Fix parsing of reversible type traits in template arguments #95969

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 13, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1037,6 +1037,7 @@ Bug Fixes to C++ Support
- Clang now correctly handles unexpanded packs in the template parameter list of a generic lambda expression
(#GH48937)
- Fix a crash when parsing an invalid type-requirement in a requires expression. (#GH51868)
- Fix parsing of built-in type-traits such as ``__is_pointer`` in libstdc++ headers. (#GH95598)

Bug Fixes to AST Handling
^^^^^^^^^^^^^^^^^^^^^^^^^
Expand Down
4 changes: 4 additions & 0 deletions clang/include/clang/Parse/Parser.h
Original file line number Diff line number Diff line change
Expand Up @@ -1877,6 +1877,10 @@ class Parser : public CodeCompletionHandler {
UnaryExprOnly,
PrimaryExprOnly
};

bool isRevertibleTypeTrait(const IdentifierInfo *Id,
clang::tok::TokenKind *Kind = nullptr);

ExprResult ParseCastExpression(CastParseKind ParseKind,
bool isAddressOfOperand,
bool &NotCastExpr,
Expand Down
163 changes: 84 additions & 79 deletions clang/lib/Parse/ParseExpr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -760,6 +760,87 @@ class CastExpressionIdValidator final : public CorrectionCandidateCallback {
};
}

bool Parser::isRevertibleTypeTrait(const IdentifierInfo *II,
tok::TokenKind *Kind) {
if (RevertibleTypeTraits.empty()) {
#define RTT_JOIN(X, Y) X##Y
#define REVERTIBLE_TYPE_TRAIT(Name) \
RevertibleTypeTraits[PP.getIdentifierInfo(#Name)] = RTT_JOIN(tok::kw_, Name)

REVERTIBLE_TYPE_TRAIT(__is_abstract);
REVERTIBLE_TYPE_TRAIT(__is_aggregate);
REVERTIBLE_TYPE_TRAIT(__is_arithmetic);
REVERTIBLE_TYPE_TRAIT(__is_array);
REVERTIBLE_TYPE_TRAIT(__is_assignable);
REVERTIBLE_TYPE_TRAIT(__is_base_of);
REVERTIBLE_TYPE_TRAIT(__is_bounded_array);
REVERTIBLE_TYPE_TRAIT(__is_class);
REVERTIBLE_TYPE_TRAIT(__is_complete_type);
REVERTIBLE_TYPE_TRAIT(__is_compound);
REVERTIBLE_TYPE_TRAIT(__is_const);
REVERTIBLE_TYPE_TRAIT(__is_constructible);
REVERTIBLE_TYPE_TRAIT(__is_convertible);
REVERTIBLE_TYPE_TRAIT(__is_convertible_to);
REVERTIBLE_TYPE_TRAIT(__is_destructible);
REVERTIBLE_TYPE_TRAIT(__is_empty);
REVERTIBLE_TYPE_TRAIT(__is_enum);
REVERTIBLE_TYPE_TRAIT(__is_floating_point);
REVERTIBLE_TYPE_TRAIT(__is_final);
REVERTIBLE_TYPE_TRAIT(__is_function);
REVERTIBLE_TYPE_TRAIT(__is_fundamental);
REVERTIBLE_TYPE_TRAIT(__is_integral);
REVERTIBLE_TYPE_TRAIT(__is_interface_class);
REVERTIBLE_TYPE_TRAIT(__is_layout_compatible);
REVERTIBLE_TYPE_TRAIT(__is_literal);
REVERTIBLE_TYPE_TRAIT(__is_lvalue_expr);
REVERTIBLE_TYPE_TRAIT(__is_lvalue_reference);
REVERTIBLE_TYPE_TRAIT(__is_member_function_pointer);
REVERTIBLE_TYPE_TRAIT(__is_member_object_pointer);
REVERTIBLE_TYPE_TRAIT(__is_member_pointer);
REVERTIBLE_TYPE_TRAIT(__is_nothrow_assignable);
REVERTIBLE_TYPE_TRAIT(__is_nothrow_constructible);
REVERTIBLE_TYPE_TRAIT(__is_nothrow_destructible);
REVERTIBLE_TYPE_TRAIT(__is_nullptr);
REVERTIBLE_TYPE_TRAIT(__is_object);
REVERTIBLE_TYPE_TRAIT(__is_pod);
REVERTIBLE_TYPE_TRAIT(__is_pointer);
REVERTIBLE_TYPE_TRAIT(__is_polymorphic);
REVERTIBLE_TYPE_TRAIT(__is_reference);
REVERTIBLE_TYPE_TRAIT(__is_referenceable);
REVERTIBLE_TYPE_TRAIT(__is_rvalue_expr);
REVERTIBLE_TYPE_TRAIT(__is_rvalue_reference);
REVERTIBLE_TYPE_TRAIT(__is_same);
REVERTIBLE_TYPE_TRAIT(__is_scalar);
REVERTIBLE_TYPE_TRAIT(__is_scoped_enum);
REVERTIBLE_TYPE_TRAIT(__is_sealed);
REVERTIBLE_TYPE_TRAIT(__is_signed);
REVERTIBLE_TYPE_TRAIT(__is_standard_layout);
REVERTIBLE_TYPE_TRAIT(__is_trivial);
REVERTIBLE_TYPE_TRAIT(__is_trivially_assignable);
REVERTIBLE_TYPE_TRAIT(__is_trivially_constructible);
REVERTIBLE_TYPE_TRAIT(__is_trivially_copyable);
REVERTIBLE_TYPE_TRAIT(__is_unbounded_array);
REVERTIBLE_TYPE_TRAIT(__is_union);
REVERTIBLE_TYPE_TRAIT(__is_unsigned);
REVERTIBLE_TYPE_TRAIT(__is_void);
REVERTIBLE_TYPE_TRAIT(__is_volatile);
REVERTIBLE_TYPE_TRAIT(__reference_binds_to_temporary);
#define TRANSFORM_TYPE_TRAIT_DEF(_, Trait) \
REVERTIBLE_TYPE_TRAIT(RTT_JOIN(__, Trait));
#include "clang/Basic/TransformTypeTraits.def"
#undef REVERTIBLE_TYPE_TRAIT
#undef RTT_JOIN
}
llvm::SmallDenseMap<IdentifierInfo *, tok::TokenKind>::iterator Known =
RevertibleTypeTraits.find(II);
if (Known != RevertibleTypeTraits.end()) {
if (Kind)
*Kind = Known->second;
return true;
}
return false;
}

/// Parse a cast-expression, or, if \pisUnaryExpression is true, parse
/// a unary-expression.
///
Expand Down Expand Up @@ -1118,85 +1199,9 @@ ExprResult Parser::ParseCastExpression(CastParseKind ParseKind,
else if (Next.is(tok::l_paren) && Tok.is(tok::identifier) &&
Tok.getIdentifierInfo()->hasRevertedTokenIDToIdentifier()) {
IdentifierInfo *II = Tok.getIdentifierInfo();
// Build up the mapping of revertible type traits, for future use.
if (RevertibleTypeTraits.empty()) {
#define RTT_JOIN(X,Y) X##Y
#define REVERTIBLE_TYPE_TRAIT(Name) \
RevertibleTypeTraits[PP.getIdentifierInfo(#Name)] \
= RTT_JOIN(tok::kw_,Name)

REVERTIBLE_TYPE_TRAIT(__is_abstract);
REVERTIBLE_TYPE_TRAIT(__is_aggregate);
REVERTIBLE_TYPE_TRAIT(__is_arithmetic);
REVERTIBLE_TYPE_TRAIT(__is_array);
REVERTIBLE_TYPE_TRAIT(__is_assignable);
REVERTIBLE_TYPE_TRAIT(__is_base_of);
REVERTIBLE_TYPE_TRAIT(__is_bounded_array);
REVERTIBLE_TYPE_TRAIT(__is_class);
REVERTIBLE_TYPE_TRAIT(__is_complete_type);
REVERTIBLE_TYPE_TRAIT(__is_compound);
REVERTIBLE_TYPE_TRAIT(__is_const);
REVERTIBLE_TYPE_TRAIT(__is_constructible);
REVERTIBLE_TYPE_TRAIT(__is_convertible);
REVERTIBLE_TYPE_TRAIT(__is_convertible_to);
REVERTIBLE_TYPE_TRAIT(__is_destructible);
REVERTIBLE_TYPE_TRAIT(__is_empty);
REVERTIBLE_TYPE_TRAIT(__is_enum);
REVERTIBLE_TYPE_TRAIT(__is_floating_point);
REVERTIBLE_TYPE_TRAIT(__is_final);
REVERTIBLE_TYPE_TRAIT(__is_function);
REVERTIBLE_TYPE_TRAIT(__is_fundamental);
REVERTIBLE_TYPE_TRAIT(__is_integral);
REVERTIBLE_TYPE_TRAIT(__is_interface_class);
REVERTIBLE_TYPE_TRAIT(__is_layout_compatible);
REVERTIBLE_TYPE_TRAIT(__is_literal);
REVERTIBLE_TYPE_TRAIT(__is_lvalue_expr);
REVERTIBLE_TYPE_TRAIT(__is_lvalue_reference);
REVERTIBLE_TYPE_TRAIT(__is_member_function_pointer);
REVERTIBLE_TYPE_TRAIT(__is_member_object_pointer);
REVERTIBLE_TYPE_TRAIT(__is_member_pointer);
REVERTIBLE_TYPE_TRAIT(__is_nothrow_assignable);
REVERTIBLE_TYPE_TRAIT(__is_nothrow_constructible);
REVERTIBLE_TYPE_TRAIT(__is_nothrow_destructible);
REVERTIBLE_TYPE_TRAIT(__is_nullptr);
REVERTIBLE_TYPE_TRAIT(__is_object);
REVERTIBLE_TYPE_TRAIT(__is_pod);
REVERTIBLE_TYPE_TRAIT(__is_pointer);
REVERTIBLE_TYPE_TRAIT(__is_polymorphic);
REVERTIBLE_TYPE_TRAIT(__is_reference);
REVERTIBLE_TYPE_TRAIT(__is_referenceable);
REVERTIBLE_TYPE_TRAIT(__is_rvalue_expr);
REVERTIBLE_TYPE_TRAIT(__is_rvalue_reference);
REVERTIBLE_TYPE_TRAIT(__is_same);
REVERTIBLE_TYPE_TRAIT(__is_scalar);
REVERTIBLE_TYPE_TRAIT(__is_scoped_enum);
REVERTIBLE_TYPE_TRAIT(__is_sealed);
REVERTIBLE_TYPE_TRAIT(__is_signed);
REVERTIBLE_TYPE_TRAIT(__is_standard_layout);
REVERTIBLE_TYPE_TRAIT(__is_trivial);
REVERTIBLE_TYPE_TRAIT(__is_trivially_assignable);
REVERTIBLE_TYPE_TRAIT(__is_trivially_constructible);
REVERTIBLE_TYPE_TRAIT(__is_trivially_copyable);
REVERTIBLE_TYPE_TRAIT(__is_unbounded_array);
REVERTIBLE_TYPE_TRAIT(__is_union);
REVERTIBLE_TYPE_TRAIT(__is_unsigned);
REVERTIBLE_TYPE_TRAIT(__is_void);
REVERTIBLE_TYPE_TRAIT(__is_volatile);
REVERTIBLE_TYPE_TRAIT(__reference_binds_to_temporary);
#define TRANSFORM_TYPE_TRAIT_DEF(_, Trait) \
REVERTIBLE_TYPE_TRAIT(RTT_JOIN(__, Trait));
#include "clang/Basic/TransformTypeTraits.def"
#undef REVERTIBLE_TYPE_TRAIT
#undef RTT_JOIN
}

// If we find that this is in fact the name of a type trait,
// update the token kind in place and parse again to treat it as
// the appropriate kind of type trait.
llvm::SmallDenseMap<IdentifierInfo *, tok::TokenKind>::iterator Known
= RevertibleTypeTraits.find(II);
if (Known != RevertibleTypeTraits.end()) {
Tok.setKind(Known->second);
tok::TokenKind Kind;
if (isRevertibleTypeTrait(II, &Kind)) {
Tok.setKind(Kind);
return ParseCastExpression(ParseKind, isAddressOfOperand,
NotCastExpr, isTypeCast,
isVectorLiteral, NotPrimaryExpression);
Expand Down
9 changes: 9 additions & 0 deletions clang/lib/Parse/ParseTentative.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1385,6 +1385,15 @@ Parser::isCXXDeclarationSpecifier(ImplicitTypenameContext AllowImplicitTypename,
if (!getLangOpts().ObjC && Next.is(tok::identifier))
return TPResult::True;

// If this identifier was reverted from a token ID, and the next token
// is a '(', we assume it to be a use of a type trait, so this
// can never be a type name.
if (Next.is(tok::l_paren) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might need a comment above this like:

// If this identifier was reverted from a token ID, and the next token
// is a '(', this is likely to be a use of a type trait. If it is, this
// can never be a type name.

Tok.getIdentifierInfo()->hasRevertedTokenIDToIdentifier() &&
isRevertibleTypeTrait(Tok.getIdentifierInfo())) {
Comment on lines +1391 to +1393
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can set the correct tok::TokenKind here too since it will be set later anyway when this is re-parsed as an expression

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather not mutate anything in this function

return TPResult::False;
}

if (Next.isNot(tok::coloncolon) && Next.isNot(tok::less)) {
// Determine whether this is a valid expression. If not, we will hit
// a parse error one way or another. In that case, tell the caller that
Expand Down
12 changes: 12 additions & 0 deletions clang/test/Parser/cxx-template-argument.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -141,3 +141,15 @@ namespace r360308_regression {
return a == b;
}
}

namespace GH95598 {
template<typename _Tp, bool _IsPtr = __is_pointer(_Tp)>
struct __is_pointer {};
// expected-warning@-1 {{keyword '__is_pointer' will be made available as an identifier for the remainder of the translation unit}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diagnostic is very confusing because it implies the name is only available as an identifier when in reality it's available as either an identifier or a keyword. This seems to be the original functionality though: https://godbolt.org/z/WzTs4T31j (based on the changes in 47642d2 which added this diagnostic).

So I think your changes are aligned with the original intent of the diagnostic, but I've CCed @zygoloid and @rnk just in case others from this era recall differently.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we've incrementally ended up in a bad place. Here's my memory of what happened:

Originally, the idea was: we provide builtins such as __is_pointer for use by the standard library. But for a bunch of those, libstdc++ provides a type with the same name that works it out for itself. OK, we can work around that: if we see a declaration with that name, then the standard library isn't using our builtin, so we treat it as a non-keyword so that libstdc++ doesn't break.

But then it turns out that boost uses these things as keywords directly, rather than using the standard library functionality. To keep both that and libstdc++ working at the same time, we treat the keyword followed by an open paren as having the builtin meaning even when the keyword has been reverted to being a non-keyword.

And now we're noticing that we also need special behavior in other parts of the parser to treat the keyword followed by an open paren as a special case.

I don't think we should be extending this hack further. I've suggested on the libstdc++ bug report that this be fixed in libstdc++ instead, by stopping using our keywords as their type names. Maybe one day we can then remove it entirely.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing we could consider doing would be removing the entire "revertible keyword" mechanism, and instead, in the lexer, classifying these tokens as keywords only if the next pp-token is an lparen. That would probably simplify the handling of these keywords everywhere, and would happen to give the outcome that libstdc++ is expecting in this case too. (It's a breaking change for some weird corner cases, such as producing the keyword via macro expansion and things like bool __is_pointer(true);, but hopefully those things don't happen in practice!)

Copy link
Contributor

@MitalAshok MitalAshok Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zygoloid https://github.com/itanium-cxx-abi/cxx-abi/blob/07427ac5b6ff0799c47e8c754609cf8220c05cc0/gcc-tinfo2.cc#L373 (never mind, this turns out to have never compiled with Clang in the first place. GCC seems to support __is_pointer-as-an-identifier in a lot more positions than Clang though: https://godbolt.org/z/KGTWa1hqY )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zygoloid I prototyped something #96097

One of the test has a default constructor declaration, ie is_pod(), which the lexer approach would struggle to cope with. Let me know what you think

The alternative is to keep status quo or go with the current PR and change the warning to a depreciation warning.
My concern is that until we support older libstdc++ (which will be the case for a long time), we won't be able to rip the hack out

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the historical information @zygoloid!

I don't think we should be extending this hack further. I've suggested on the libstdc++ bug report that this be fixed in libstdc++ instead, by stopping using our keywords as their type names. Maybe one day we can then remove it entirely.

I tend to agree; if we can avoid continuing to extend this hack, I think we should. That said, I'd be curious whether deprecation is highly disruptive or not; my intuition is that we'd struggle with this one because people tend to use boost (and certainly libstdc++) as system headers, and we suppress diagnostics by default in system headers. So I'd expect this deprecation to take a long while before we could remove the hacks due to the long tail of not wanting to break system headers.

The ctor situation would be annoying, but I think the ctor and dtor are the only places we'd have to put in a special hack for "got a keyword where we expected an identifier", so perhaps the blast radius is sufficiently limited for that to be a reasonable approach?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are now patches posted for libstdc++ to avoid using our type trait keywords as identifiers.


template<bool>
struct ts{};

template<typename _Tp>
struct is_pointer : ts<__is_pointer(_Tp)> {};
}
Loading