Skip to content

[C] Diagnose use of C++ keywords in C #137234

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 27 commits into from
May 2, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
56a3f3c
[C] Diagnose use of C++ keywords in C
AaronBallman Apr 24, 2025
3b0732e
Fix the parameter handling
AaronBallman Apr 24, 2025
55350f0
Update comment; NFC
AaronBallman Apr 24, 2025
52fe07b
Fix formatting; NFC
AaronBallman Apr 24, 2025
fb976c2
Only perform the check if the diagnostic is enabled
AaronBallman Apr 25, 2025
972a5fa
Add test cases for override and final
AaronBallman Apr 25, 2025
81ac3ce
Handle alternative keywords
AaronBallman Apr 25, 2025
6c5d838
Rename flag
AaronBallman Apr 25, 2025
20dcdae
Temporarily back this change out
AaronBallman Apr 25, 2025
07e44e0
Remove the need for string comparisons
AaronBallman Apr 25, 2025
908162e
Merge remote-tracking branch 'origin/main' into perf/aballman-gh21898
AaronBallman Apr 25, 2025
c8b6f48
Remove old implementation, go back to checking only when enabled
AaronBallman Apr 25, 2025
1308a4e
Merge remote-tracking branch 'origin/main' into perf/aballman-gh21898
AaronBallman Apr 28, 2025
6bd0c2f
Update based on review feedback
AaronBallman Apr 28, 2025
b07e0fd
One more iteration
AaronBallman Apr 28, 2025
5bf4006
Change initialization; NFC
AaronBallman Apr 28, 2025
ac746fa
Merge remote-tracking branch 'origin/main' into perf/aballman-gh21898
AaronBallman Apr 29, 2025
d0551ca
Go with another implementation of the keyword logic
AaronBallman Apr 29, 2025
205aa3c
Merge remote-tracking branch 'origin/main' into perf/aballman-gh21898
AaronBallman Apr 29, 2025
16594b3
Implement in the preprocessor rather than Sema
AaronBallman Apr 29, 2025
6b12e5e
Handle Objective-C @ keywords
AaronBallman Apr 29, 2025
49c7db0
Update for Objective-C again
AaronBallman Apr 30, 2025
485c548
Merge remote-tracking branch 'origin/main' into perf/aballman-gh21898
AaronBallman Apr 30, 2025
c0fbe17
Repair OpenMP test
AaronBallman Apr 30, 2025
c3547f9
Merge remote-tracking branch 'origin/main' into perf/aballman-gh21898
AaronBallman Apr 30, 2025
082d94c
Merge remote-tracking branch 'origin/main' into perf/aballman-gh21898
AaronBallman May 1, 2025
f00bd69
Remove support for ObjC, update release note
AaronBallman May 1, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -154,6 +154,8 @@ C Language Changes
- Added ``-Wimplicit-void-ptr-cast``, grouped under ``-Wc++-compat``, which
diagnoses implicit conversion from ``void *`` to another pointer type as
being incompatible with C++. (#GH17792)
- Added ``-Wc++-keyword``, grouped under ``-Wc++-compat``, which diagnoses when
a C++ keyword is used as an identifier in C. (#GH21898)
- Added ``-Wc++-hidden-decl``, grouped under ``-Wc++-compat``, which diagnoses
use of tag types which are visible in C but not visible in C++ due to scoping
rules. e.g.,
Expand Down Expand Up @@ -481,6 +483,8 @@ Improvements to Clang's diagnostics
- ``-Winitializer-overrides`` and ``-Wreorder-init-list`` are now grouped under
the ``-Wc99-designator`` diagnostic group, as they also are about the
behavior of the C99 feature as it was introduced into C++20. Fixes #GH47037
- ``-Wreserved-identifier`` now fires on reserved parameter names in a function
declaration which is not a definition.

Improvements to Clang's time-trace
----------------------------------
Expand Down
7 changes: 4 additions & 3 deletions clang/include/clang/Basic/DiagnosticGroups.td
Original file line number Diff line number Diff line change
Expand Up @@ -157,6 +157,7 @@ def C99Compat : DiagGroup<"c99-compat">;
def C23Compat : DiagGroup<"c23-compat">;
def : DiagGroup<"c2x-compat", [C23Compat]>;

def CppKeywordInC : DiagGroup<"c++-keyword">;
def DuplicateDeclSpecifier : DiagGroup<"duplicate-decl-specifier">;
def InitStringTooLongMissingNonString :
DiagGroup<"unterminated-string-initialization">;
Expand All @@ -178,9 +179,9 @@ def ImplicitIntToEnumCast : DiagGroup<"implicit-int-enum-cast",
[ImplicitEnumEnumCast]>;
def TentativeDefnCompat : DiagGroup<"tentative-definition-compat">;
def CXXCompat: DiagGroup<"c++-compat", [ImplicitVoidPtrCast, DefaultConstInit,
ImplicitIntToEnumCast, HiddenCppDecl,
InitStringTooLongForCpp,
TentativeDefnCompat,
ImplicitIntToEnumCast, CppKeywordInC,
HiddenCppDecl, TentativeDefnCompat,
InitStringTooLongForCpp,
DuplicateDeclSpecifier]>;

def ExternCCompat : DiagGroup<"extern-c-compat">;
Expand Down
3 changes: 3 additions & 0 deletions clang/include/clang/Basic/DiagnosticLexKinds.td
Original file line number Diff line number Diff line change
Expand Up @@ -421,6 +421,9 @@ def warn_pp_macro_is_reserved_attribute_id : Warning<
def warn_pp_objc_macro_redef_ignored : Warning<
"ignoring redefinition of Objective-C qualifier macro">,
InGroup<DiagGroup<"objc-macro-redefinition">>;
def warn_pp_identifier_is_cpp_keyword : Warning<
"identifier %0 conflicts with a C++ keyword">,
InGroup<CppKeywordInC>, DefaultIgnore;

def pp_invalid_string_literal : Warning<
"invalid string literal, ignoring final '\\'">;
Expand Down
13 changes: 11 additions & 2 deletions clang/include/clang/Basic/IdentifierTable.h
Original file line number Diff line number Diff line change
Expand Up @@ -195,7 +195,11 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
LLVM_PREFERRED_TYPE(bool)
unsigned IsFinal : 1;

// 22 bits left in a 64-bit word.
// True if this identifier would be a keyword in C++ mode.
LLVM_PREFERRED_TYPE(bool)
unsigned IsKeywordInCpp : 1;

// 21 bits left in a 64-bit word.

// Managed by the language front-end.
void *FETokenInfo = nullptr;
Expand All @@ -212,7 +216,7 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
IsFromAST(false), ChangedAfterLoad(false), FEChangedAfterLoad(false),
RevertedTokenID(false), OutOfDate(false), IsModulesImport(false),
IsMangledOpenMPVariantName(false), IsDeprecatedMacro(false),
IsRestrictExpansion(false), IsFinal(false) {}
IsRestrictExpansion(false), IsFinal(false), IsKeywordInCpp(false) {}

public:
IdentifierInfo(const IdentifierInfo &) = delete;
Expand Down Expand Up @@ -444,6 +448,10 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
}
bool isCPlusPlusOperatorKeyword() const { return IsCPPOperatorKeyword; }

/// Return true if this identifier would be a keyword in C++ mode.
bool IsKeywordInCPlusPlus() const { return IsKeywordInCpp; }
void setIsKeywordInCPlusPlus(bool Val = true) { IsKeywordInCpp = Val; }

/// Return true if this token is a keyword in the specified language.
bool isKeyword(const LangOptions &LangOpts) const;

Expand All @@ -462,6 +470,7 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
/// If this returns false, we know that HandleIdentifier will not affect
/// the token.
bool isHandleIdentifierCase() const { return NeedsHandleIdentifier; }
void setHandleIdentifierCase(bool Val = true) { NeedsHandleIdentifier = Val; }

/// Return true if the identifier in its current state was loaded
/// from an AST file.
Expand Down
1 change: 1 addition & 0 deletions clang/include/clang/Basic/TokenKinds.def
Original file line number Diff line number Diff line change
Expand Up @@ -1042,6 +1042,7 @@ ANNOTATION(embed)
#undef TYPE_TRAIT_2
#undef TYPE_TRAIT_1
#undef TYPE_TRAIT
#undef MODULES_KEYWORD
#undef CXX20_KEYWORD
#undef CXX11_KEYWORD
#undef KEYWORD
Expand Down
48 changes: 43 additions & 5 deletions clang/lib/Basic/IdentifierTable.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -250,6 +250,32 @@ static KeywordStatus getKeywordStatus(const LangOptions &LangOpts,
return CurStatus;
}

static bool IsKeywordInCpp(unsigned Flags) {
while (Flags != 0) {
unsigned CurFlag = Flags & ~(Flags - 1);
Flags = Flags & ~CurFlag;
switch (static_cast<TokenKey>(CurFlag)) {
case KEYCXX:
case KEYCXX11:
case KEYCXX20:
case BOOLSUPPORT:
case WCHARSUPPORT:
case CHAR8SUPPORT:
return true;
default:
break; // Go to the next flag, try again.
}
}
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by this function. Why does this go through all bits in a loop instead of directly checking them? I'd have expected this whole function to be something like return Flags & (KEYALLCXX|BOOLSUPPORT|WCHARSUPPORT|CHAR8SUPPORT);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code was lifted from

but yeah, this seems to be something I can simplify

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated in 0b2ab11 thank you for the catch!

}

static void MarkIdentifierAsKeywordInCpp(IdentifierTable &Table,
StringRef Name) {
IdentifierInfo &II = Table.get(Name, tok::identifier);
II.setIsKeywordInCPlusPlus();
II.setHandleIdentifierCase();
}

/// AddKeyword - This method is used to associate a token ID with specific
/// identifiers because they are language keywords. This causes the lexer to
/// automatically map matching identifiers to specialized token codes.
Expand All @@ -258,8 +284,18 @@ static void AddKeyword(StringRef Keyword,
const LangOptions &LangOpts, IdentifierTable &Table) {
KeywordStatus AddResult = getKeywordStatus(LangOpts, Flags);

// Don't add this keyword if disabled in this language.
if (AddResult == KS_Disabled) return;
// Don't add this keyword if disabled in this language and isn't otherwise
// special.
if (AddResult == KS_Disabled) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional:

It'd be more consistent with the existing code to add an extra KeywordStatus value for this case, analogous to KS_Future. To do that, maybe we'd need to also replace the special macro for C++ operator keywords with a normal flag so that it can be taken into account when computing the KeywordStatus too.

I don't think it's necessary to be consistent here, though; we can always refactor later.

// We do not consider any identifiers to be C++ keywords when in
// Objective-C because @ effectively introduces a custom grammar where C++
// keywords can be used (and similar for selectors). We could enable this
// for Objective-C, but it would require more logic to ensure we do not
// issue compatibility diagnostics in these cases.
if (!LangOpts.ObjC && IsKeywordInCpp(Flags))
MarkIdentifierAsKeywordInCpp(Table, Keyword);
return;
}

IdentifierInfo &Info =
Table.get(Keyword, AddResult == KS_Future ? tok::identifier : TokenCode);
Expand Down Expand Up @@ -304,9 +340,11 @@ void IdentifierTable::AddKeywords(const LangOptions &LangOpts) {
#define ALIAS(NAME, TOK, FLAGS) \
AddKeyword(StringRef(NAME), tok::kw_ ## TOK, \
FLAGS, LangOpts, *this);
#define CXX_KEYWORD_OPERATOR(NAME, ALIAS) \
if (LangOpts.CXXOperatorNames) \
AddCXXOperatorKeyword(StringRef(#NAME), tok::ALIAS, *this);
#define CXX_KEYWORD_OPERATOR(NAME, ALIAS) \
if (LangOpts.CXXOperatorNames) \
AddCXXOperatorKeyword(StringRef(#NAME), tok::ALIAS, *this); \
else \
MarkIdentifierAsKeywordInCpp(*this, StringRef(#NAME));
#define OBJC_AT_KEYWORD(NAME) \
if (LangOpts.ObjC) \
AddObjCKeyword(StringRef(#NAME), tok::objc_##NAME, *this);
Expand Down
5 changes: 5 additions & 0 deletions clang/lib/Lex/Preprocessor.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -834,6 +834,11 @@ bool Preprocessor::HandleIdentifier(Token &Identifier) {
II.setIsFutureCompatKeyword(false);
}

// If this identifier would be a keyword in C++, diagnose as a compatibility
// issue.
if (II.IsKeywordInCPlusPlus() && !DisableMacroExpansion)
Diag(Identifier, diag::warn_pp_identifier_is_cpp_keyword) << &II;

// If this is an extension token, diagnose its use.
// We avoid diagnosing tokens that originate from macro definitions.
// FIXME: This warning is disabled in cases where it shouldn't be,
Expand Down
11 changes: 11 additions & 0 deletions clang/lib/Sema/SemaDecl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -59,8 +59,10 @@
#include "clang/Sema/SemaWasm.h"
#include "clang/Sema/Template.h"
#include "llvm/ADT/STLForwardCompat.h"
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringSwitch.h"
#include "llvm/Support/SaveAndRestore.h"
#include "llvm/TargetParser/Triple.h"
#include <algorithm>
Expand Down Expand Up @@ -10421,6 +10423,15 @@ Sema::ActOnFunctionDeclarator(Scope *S, Declarator &D, DeclContext *DC,
// Finally, we know we have the right number of parameters, install them.
NewFD->setParams(Params);

// If this declarator is a declaration and not a definition, its parameters
// will not be pushed onto a scope chain. That means we will not issue any
// reserved identifier warnings for the declaration, but we will for the
// definition. Handle those here.
if (!D.isFunctionDefinition()) {
for (const ParmVarDecl *PVD : Params)
warnOnReservedIdentifier(PVD);
}

if (D.getDeclSpec().isNoreturnSpecified())
NewFD->addAttr(
C11NoReturnAttr::Create(Context, D.getDeclSpec().getNoreturnSpecLoc()));
Expand Down
10 changes: 8 additions & 2 deletions clang/test/OpenMP/assumes_messages.c
Original file line number Diff line number Diff line change
Expand Up @@ -59,8 +59,14 @@
#pragma omp begin assumes ext // expected-warning {{valid begin assumes clauses start with 'ext_', 'absent', 'contains', 'holds', 'no_openmp', 'no_openmp_routines', 'no_openmp_constructs', 'no_parallelism'; token will be ignored}}
#pragma omp end assumes

#pragma omp assumes ext_123(not allowed) // expected-warning {{'ext_123' clause should not be followed by arguments; tokens will be ignored}} expected-note {{the ignored tokens spans until here}}
#pragma omp begin assumes ext_123(not allowed) // expected-warning {{'ext_123' clause should not be followed by arguments; tokens will be ignored}} expected-note {{the ignored tokens spans until here}}
// FIXME: We should be getting an expected note about where the span of ignored
// tokens ends. However, error recovery ends up lexing the 'not' token,
// emitting a (silenced) diagnostic about use of a C++ keyword in C, and the
// note gets associated with *that* (silenced) diagnostic. This is an existing
// issue that also happens with error recovery of reserved identifiers or
// extension tokens, but is unfortunate nonetheless.
#pragma omp assumes ext_123(not allowed) // expected-warning {{'ext_123' clause should not be followed by arguments; tokens will be ignored}}
#pragma omp begin assumes ext_123(not allowed) // expected-warning {{'ext_123' clause should not be followed by arguments; tokens will be ignored}}
#pragma omp end assumes

#pragma omp end assumes // expected-error {{'#pragma omp end assumes' with no matching '#pragma omp begin assumes'}}
Expand Down
Loading