Skip to content

[clang] Clear NeedsCleaning flag after ExpandBuiltinMacro #133574

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

mariusdr
Copy link
Contributor

@mariusdr mariusdr commented Mar 29, 2025

After builtin macro expansion in Preprocessor::ExpandBuiltinMacro the result token may have the Token::NeedsCleaning flag set which causes an assertion failure later on when the lexer retrieves the spelling of the token in getSpellingSlow.

This commit adds an Tok.clearFlag(Token::NeedsCleaning) call to the end of ExpandBuiltinMacro.

Closes #128384

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Mar 29, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 29, 2025

@llvm/pr-subscribers-clang

Author: marius doerner (mariusdr)

Changes

After builtin macro expansion in Preprocessor::ExpandBuiltinMacro the result token may have the Token::NeedsCleaning flag set which causes an assertion failure later on when the lexer retrieves the spelling of the token in getSpellingSlow.

This commit moves the Tok.clearFlag(Token::NeedsCleaning) call to the end of ExpandBuiltinMacro.

Closes #128384


Full diff: https://github.com/llvm/llvm-project/pull/133574.diff

6 Files Affected:

  • (modified) clang/lib/Lex/PPMacroExpansion.cpp (+1-1)
  • (modified) clang/test/Preprocessor/embed___has_embed.c (+19)
  • (modified) clang/test/Preprocessor/has_attribute.c (+20)
  • (modified) clang/test/Preprocessor/has_attribute.cpp (+20)
  • (modified) clang/test/Preprocessor/has_c_attribute.c (+20)
  • (modified) clang/test/Preprocessor/has_include.c (+49)
diff --git a/clang/lib/Lex/PPMacroExpansion.cpp b/clang/lib/Lex/PPMacroExpansion.cpp
index 8e35d56d3f1a6..107d5c31e039d 100644
--- a/clang/lib/Lex/PPMacroExpansion.cpp
+++ b/clang/lib/Lex/PPMacroExpansion.cpp
@@ -1646,7 +1646,6 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
 
   // Set up the return result.
   Tok.setIdentifierInfo(nullptr);
-  Tok.clearFlag(Token::NeedsCleaning);
   bool IsAtStartOfLine = Tok.isAtStartOfLine();
   bool HasLeadingSpace = Tok.hasLeadingSpace();
 
@@ -2089,6 +2088,7 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
   CreateString(OS.str(), Tok, Tok.getLocation(), Tok.getLocation());
   Tok.setFlagValue(Token::StartOfLine, IsAtStartOfLine);
   Tok.setFlagValue(Token::LeadingSpace, HasLeadingSpace);
+  Tok.clearFlag(Token::NeedsCleaning);
 }
 
 void Preprocessor::markMacroAsUsed(MacroInfo *MI) {
diff --git a/clang/test/Preprocessor/embed___has_embed.c b/clang/test/Preprocessor/embed___has_embed.c
index 43a3068b5f53a..2705b5ef6fd5b 100644
--- a/clang/test/Preprocessor/embed___has_embed.c
+++ b/clang/test/Preprocessor/embed___has_embed.c
@@ -58,3 +58,22 @@ unsigned char buffer[] = {
 #else
 #error 17
 #endif
+
+#if __has_embed(__FILE__\
+)
+#else
+#error 18
+#endif
+
+#define F __FI\
+LE__
+#if __has_embed(F)
+#else
+#error 19
+#endif
+
+#if __has_embed(F\
+)
+#else 
+#error 20
+#endif
diff --git a/clang/test/Preprocessor/has_attribute.c b/clang/test/Preprocessor/has_attribute.c
index 0ba664a53e649..6f6f519bd299d 100644
--- a/clang/test/Preprocessor/has_attribute.c
+++ b/clang/test/Preprocessor/has_attribute.c
@@ -68,3 +68,23 @@ int has_no_volatile_attribute();
 int has_fallthrough;
 #endif
 // CHECK: int has_fallthrough;
+
+#if __has_attribute(F\
+)
+int has_fallthrough_2;
+#endif
+// CHECK: int has_fallthrough_2;
+
+#define F_2 fall\
+through
+
+#if __has_attribute(F_2)
+int has_fallthrough_3;
+#endif
+// CHECK: int has_fallthrough_3;
+
+#if __has_attribute(F_2\
+)
+int has_fallthrough_4;
+#endif
+// CHECK: int has_fallthrough_4;
diff --git a/clang/test/Preprocessor/has_attribute.cpp b/clang/test/Preprocessor/has_attribute.cpp
index 00ec57615c84b..72af6de27e8bb 100644
--- a/clang/test/Preprocessor/has_attribute.cpp
+++ b/clang/test/Preprocessor/has_attribute.cpp
@@ -116,6 +116,26 @@ int funclike_1;
 int funclike_2;
 #endif
 // CHECK: int funclike_2;
+
+#if __has_cpp_attribute(CF\
+)
+int has_clang_falthrough_5;
+#endif
+// CHECK: int has_clang_falthrough_5;
+
+#define CF_2 clang::\
+fallthrough
+
+#if __has_cpp_attribute(CF_2)
+int has_clang_falthrough_6;
+#endif
+// CHECK: int has_clang_falthrough_6;
+
+#if __has_cpp_attribute(CF_2\
+)
+int has_clang_falthrough_7;
+#endif
+// CHECK: int has_clang_falthrough_7;
 }
 
 // Test for Microsoft __declspec attributes
diff --git a/clang/test/Preprocessor/has_c_attribute.c b/clang/test/Preprocessor/has_c_attribute.c
index 3332571d758c8..d8be13f5898a9 100644
--- a/clang/test/Preprocessor/has_c_attribute.c
+++ b/clang/test/Preprocessor/has_c_attribute.c
@@ -84,3 +84,23 @@ int funclike_1;
 int funclike_2;
 #endif
 // CHECK: int funclike_2;
+
+#if __has_c_attribute(CL\
+)
+int has_clang_likely_5;
+#endif
+// CHECK: int has_clang_likely_5;
+
+#define CL_2 clang::\
+likely
+
+#if __has_c_attribute(CL_2)
+int has_clang_likely_6;
+#endif
+// CHECK: int has_clang_likely_6;
+
+#if __has_c_attribute(CL_2\
+)
+int has_clang_likely_7;
+#endif
+// CHECK: int has_clang_likely_7;
diff --git a/clang/test/Preprocessor/has_include.c b/clang/test/Preprocessor/has_include.c
index c95025d83860a..ff199bf23063f 100644
--- a/clang/test/Preprocessor/has_include.c
+++ b/clang/test/Preprocessor/has_include.c
@@ -197,3 +197,52 @@ __has_include
 #ifdef FOO
 #elif __has_include(<foo>)
 #endif
+
+#if __has_include(<stdint.h>\
+)
+#else
+  #error "__has_include failed (10)."
+#endif
+
+#define MACRO6 <stdint.h>
+#if __has_include(MACRO6\
+)
+#else
+  #error "__has_include failed (11)."
+#endif
+
+#if __has_include_next(<stdint.h>/*expected-warning {{#include_next in primary source file}}*/\
+)
+#else
+  #error "__has_include_next failed (9)."
+#endif
+
+#if __has_include_next(MACRO6/*expected-warning {{#include_next in primary source file}}*/\
+) 
+#else
+  #error "__has_include_next failed (10)."
+#endif
+
+#define MACRO7 <std\
+int.h>
+#if __has_include(MACRO7)
+#else
+  #error "__has_include failed (12)."
+#endif
+
+#if __has_include(MACRO7\
+)
+#else
+  #error "__has_include failed (13)."
+#endif
+
+#if __has_include_next(MACRO7) //expected-warning {{#include_next in primary source file}}
+#else
+  #error "__has_include_next failed (11)."
+#endif
+
+#if __has_include_next(MACRO7/*expected-warning {{#include_next in primary source file}}*/\
+) 
+#else
+  #error "__has_include_next failed (12)."
+#endif

Copy link

⚠️ We detected that you are using a GitHub private e-mail address to contribute to the repo.
Please turn off Keep my email addresses private setting in your account.
See LLVM Discourse for more information.

@mariusdr mariusdr force-pushed the clang_lexer_clear_needscleaning_flag_after_expand_builtin_macro branch from ae2279b to dd54aa5 Compare March 29, 2025 08:42
@@ -1646,7 +1646,6 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {

// Set up the return result.
Tok.setIdentifierInfo(nullptr);
Tok.clearFlag(Token::NeedsCleaning);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are early returns between where we used to clear the needs cleaning flag and where we set it now; doesn't this mean we'll leave the flag in "needs cleaning" state in circumstances where it won't need cleaning (e.g., when dealing with __DATE__)?

Copy link
Contributor Author

@mariusdr mariusdr Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, my bad. If the flag is removed we can get asserts on __DATE__ and __TIME__ expansion. I've put it back and added some test cases to account for this (also for some similar macros like __LINE__, __FILE__ which do not early return but are easily tested in the same way).

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally LGTM, though please add a release note to clang/docs/ReleaseNotes.rst. I'd appreciate @cor3ntin taking a look as well before landing, just so we get another set of eyes on the changes.

@mariusdr
Copy link
Contributor Author

mariusdr commented Apr 2, 2025

This generally LGTM, though please add a release note to clang/docs/ReleaseNotes.rst. I'd appreciate @cor3ntin taking a look as well before landing, just so we get another set of eyes on the changes.

Thanks, added release notes under "Bug Fixes in this Version".

mariusdr added 3 commits April 2, 2025 19:42
After builtin macro expansion in `Preprocessor::ExpandBuiltinMacro`
the result token may have the `Token::NeedsCleaning` flag set
which causes an assertion failure later on when the lexer retrives
the spelling of the token in `getSpellingSlow` in Lexer.cpp.
at the beginning of ExpandBuiltinMacro to account for early returns
(e.g..in __DATE__ or __TIME__ expansion).
@mariusdr mariusdr force-pushed the clang_lexer_clear_needscleaning_flag_after_expand_builtin_macro branch from 445336a to 686c085 Compare April 2, 2025 17:43
@AaronBallman AaronBallman requested a review from cor3ntin April 3, 2025 11:46
Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - sorry for the delayed response.

Will you need me to merge that for you (once the merge conflict is resolved)?

@mariusdr
Copy link
Contributor Author

LGTM - sorry for the delayed response.

Will you need me to merge that for you (once the merge conflict is resolved)?

Yes, thanks!

@cor3ntin cor3ntin merged commit 9a1ece2 into llvm:main Apr 15, 2025
9 of 11 checks passed
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…33574)

After builtin macro expansion in `Preprocessor::ExpandBuiltinMacro` the
result token may have the `Token::NeedsCleaning` flag set which causes
an assertion failure later on when the lexer retrieves the spelling of
the token in `getSpellingSlow`.

This commit adds an `Tok.clearFlag(Token::NeedsCleaning)` call to the
end of `ExpandBuiltinMacro`.

Closes llvm#128384
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

__has_embed: Assertion `Length < Tok.getLength() && "NeedsCleaning flag set on token that didn't need cleaning!"' failed.
4 participants