-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang][refactor] Refactor findNextTokenIncludingComments
#123060
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-tools-extra @llvm/pr-subscribers-clang-tidy Author: Clement Courbet (legrosbuffle) ChangesWe have two copies of the same code in clang-tidy and clang-reorder-fields, and those are extremenly similar to Full diff: https://github.com/llvm/llvm-project/pull/123060.diff 7 Files Affected:
diff --git a/clang-tools-extra/clang-reorder-fields/ReorderFieldsAction.cpp b/clang-tools-extra/clang-reorder-fields/ReorderFieldsAction.cpp
index 80ee31368fe9a5..40c96f92254e42 100644
--- a/clang-tools-extra/clang-reorder-fields/ReorderFieldsAction.cpp
+++ b/clang-tools-extra/clang-reorder-fields/ReorderFieldsAction.cpp
@@ -118,35 +118,6 @@ findMembersUsedInInitExpr(const CXXCtorInitializer *Initializer,
return Results;
}
-/// Returns the next token after `Loc` (including comment tokens).
-static std::optional<Token> getTokenAfter(SourceLocation Loc,
- const SourceManager &SM,
- const LangOptions &LangOpts) {
- if (Loc.isMacroID()) {
- return std::nullopt;
- }
- Loc = Lexer::getLocForEndOfToken(Loc, 0, SM, LangOpts);
-
- // Break down the source location.
- std::pair<FileID, unsigned> LocInfo = SM.getDecomposedLoc(Loc);
-
- // Try to load the file buffer.
- bool InvalidTemp = false;
- StringRef File = SM.getBufferData(LocInfo.first, &InvalidTemp);
- if (InvalidTemp)
- return std::nullopt;
-
- const char *TokenBegin = File.data() + LocInfo.second;
-
- Lexer lexer(SM.getLocForStartOfFile(LocInfo.first), LangOpts, File.begin(),
- TokenBegin, File.end());
- lexer.SetCommentRetentionState(true);
- // Find the token.
- Token Tok;
- lexer.LexFromRawLexer(Tok);
- return Tok;
-}
-
/// Returns the end of the trailing comments after `Loc`.
static SourceLocation getEndOfTrailingComment(SourceLocation Loc,
const SourceManager &SM,
@@ -154,11 +125,12 @@ static SourceLocation getEndOfTrailingComment(SourceLocation Loc,
// We consider any following comment token that is indented more than the
// first comment to be part of the trailing comment.
const unsigned Column = SM.getPresumedColumnNumber(Loc);
- std::optional<Token> Tok = getTokenAfter(Loc, SM, LangOpts);
+ std::optional<Token> Tok =
+ Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments*/ true);
while (Tok && Tok->is(tok::comment) &&
SM.getPresumedColumnNumber(Tok->getLocation()) > Column) {
Loc = Tok->getEndLoc();
- Tok = getTokenAfter(Loc, SM, LangOpts);
+ Tok = Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments*/ true);
}
return Loc;
}
diff --git a/clang-tools-extra/clang-tidy/modernize/ConcatNestedNamespacesCheck.cpp b/clang-tools-extra/clang-tidy/modernize/ConcatNestedNamespacesCheck.cpp
index d24b613015d8ee..b4f54d02fc3362 100644
--- a/clang-tools-extra/clang-tidy/modernize/ConcatNestedNamespacesCheck.cpp
+++ b/clang-tools-extra/clang-tidy/modernize/ConcatNestedNamespacesCheck.cpp
@@ -59,7 +59,7 @@ SourceRange NS::getNamespaceBackRange(const SourceManager &SM,
// Back from '}' to conditional '// namespace xxx'
SourceLocation Loc = front()->getRBraceLoc();
std::optional<Token> Tok =
- utils::lexer::findNextTokenIncludingComments(Loc, SM, LangOpts);
+ Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments*/ true);
if (!Tok)
return getDefaultNamespaceBackRange();
if (Tok->getKind() != tok::TokenKind::comment)
diff --git a/clang-tools-extra/clang-tidy/modernize/UseOverrideCheck.cpp b/clang-tools-extra/clang-tidy/modernize/UseOverrideCheck.cpp
index fd5bd9f0b181b1..6191ebfbfb01f0 100644
--- a/clang-tools-extra/clang-tidy/modernize/UseOverrideCheck.cpp
+++ b/clang-tools-extra/clang-tidy/modernize/UseOverrideCheck.cpp
@@ -229,9 +229,8 @@ void UseOverrideCheck::check(const MatchFinder::MatchResult &Result) {
if (HasVirtual) {
for (Token Tok : Tokens) {
if (Tok.is(tok::kw_virtual)) {
- std::optional<Token> NextToken =
- utils::lexer::findNextTokenIncludingComments(
- Tok.getEndLoc(), Sources, getLangOpts());
+ std::optional<Token> NextToken = Lexer::findNextToken(
+ Tok.getEndLoc(), Sources, getLangOpts(), /*IncludeComments*/ true);
if (NextToken.has_value()) {
Diag << FixItHint::CreateRemoval(CharSourceRange::getCharRange(
Tok.getLocation(), NextToken->getLocation()));
diff --git a/clang-tools-extra/clang-tidy/utils/LexerUtils.cpp b/clang-tools-extra/clang-tidy/utils/LexerUtils.cpp
index 92c3e0ed7894e1..50da196315d3b3 100644
--- a/clang-tools-extra/clang-tidy/utils/LexerUtils.cpp
+++ b/clang-tools-extra/clang-tidy/utils/LexerUtils.cpp
@@ -86,29 +86,6 @@ SourceLocation findNextTerminator(SourceLocation Start, const SourceManager &SM,
return findNextAnyTokenKind(Start, SM, LangOpts, tok::comma, tok::semi);
}
-std::optional<Token>
-findNextTokenIncludingComments(SourceLocation Start, const SourceManager &SM,
- const LangOptions &LangOpts) {
- // `Lexer::findNextToken` will ignore comment
- if (Start.isMacroID())
- return std::nullopt;
- Start = Lexer::getLocForEndOfToken(Start, 0, SM, LangOpts);
- // Break down the source location.
- std::pair<FileID, unsigned> LocInfo = SM.getDecomposedLoc(Start);
- bool InvalidTemp = false;
- StringRef File = SM.getBufferData(LocInfo.first, &InvalidTemp);
- if (InvalidTemp)
- return std::nullopt;
- // Lex from the start of the given location.
- Lexer L(SM.getLocForStartOfFile(LocInfo.first), LangOpts, File.begin(),
- File.data() + LocInfo.second, File.end());
- L.SetCommentRetentionState(true);
- // Find the token.
- Token Tok;
- L.LexFromRawLexer(Tok);
- return Tok;
-}
-
std::optional<Token>
findNextTokenSkippingComments(SourceLocation Start, const SourceManager &SM,
const LangOptions &LangOpts) {
diff --git a/clang-tools-extra/clang-tidy/utils/LexerUtils.h b/clang-tools-extra/clang-tidy/utils/LexerUtils.h
index ea9bd512b68b8f..c3f8721bca5ffa 100644
--- a/clang-tools-extra/clang-tidy/utils/LexerUtils.h
+++ b/clang-tools-extra/clang-tidy/utils/LexerUtils.h
@@ -89,10 +89,6 @@ SourceLocation findNextAnyTokenKind(SourceLocation Start,
}
}
-std::optional<Token>
-findNextTokenIncludingComments(SourceLocation Start, const SourceManager &SM,
- const LangOptions &LangOpts);
-
// Finds next token that's not a comment.
std::optional<Token> findNextTokenSkippingComments(SourceLocation Start,
const SourceManager &SM,
diff --git a/clang/include/clang/Lex/Lexer.h b/clang/include/clang/Lex/Lexer.h
index b6ecc7e5ded9e2..82a041ea3f848a 100644
--- a/clang/include/clang/Lex/Lexer.h
+++ b/clang/include/clang/Lex/Lexer.h
@@ -554,7 +554,8 @@ class Lexer : public PreprocessorLexer {
/// Returns the next token, or std::nullopt if the location is inside a macro.
static std::optional<Token> findNextToken(SourceLocation Loc,
const SourceManager &SM,
- const LangOptions &LangOpts);
+ const LangOptions &LangOpts,
+ bool IncludeComments = false);
/// Checks that the given token is the first token that occurs after
/// the given location (this excludes comments and whitespace). Returns the
diff --git a/clang/lib/Lex/Lexer.cpp b/clang/lib/Lex/Lexer.cpp
index 72364500a48f9f..115b6c1606a022 100644
--- a/clang/lib/Lex/Lexer.cpp
+++ b/clang/lib/Lex/Lexer.cpp
@@ -1323,7 +1323,8 @@ const char *Lexer::SkipEscapedNewLines(const char *P) {
std::optional<Token> Lexer::findNextToken(SourceLocation Loc,
const SourceManager &SM,
- const LangOptions &LangOpts) {
+ const LangOptions &LangOpts,
+ bool IncludeComments) {
if (Loc.isMacroID()) {
if (!Lexer::isAtEndOfMacroExpansion(Loc, SM, LangOpts, &Loc))
return std::nullopt;
@@ -1344,6 +1345,7 @@ std::optional<Token> Lexer::findNextToken(SourceLocation Loc,
// Lex from the start of the given location.
Lexer lexer(SM.getLocForStartOfFile(LocInfo.first), LangOpts, File.begin(),
TokenBegin, File.end());
+ lexer.SetCommentRetentionState(IncludeComments);
// Find the token.
Token Tok;
lexer.LexFromRawLexer(Tok);
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is neat :)
utils::lexer::findNextTokenIncludingComments( | ||
Tok.getEndLoc(), Sources, getLangOpts()); | ||
std::optional<Token> NextToken = Lexer::findNextToken( | ||
Tok.getEndLoc(), Sources, getLangOpts(), /*IncludeComments*/ true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tok.getEndLoc(), Sources, getLangOpts(), /*IncludeComments*/ true); | |
Tok.getEndLoc(), Sources, getLangOpts(), /*IncludeComments=*/true); |
@@ -59,7 +59,7 @@ SourceRange NS::getNamespaceBackRange(const SourceManager &SM, | |||
// Back from '}' to conditional '// namespace xxx' | |||
SourceLocation Loc = front()->getRBraceLoc(); | |||
std::optional<Token> Tok = | |||
utils::lexer::findNextTokenIncludingComments(Loc, SM, LangOpts); | |||
Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments*/ true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments*/ true); | |
Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments=*/true); |
std::optional<Token> | ||
findNextTokenIncludingComments(SourceLocation Start, const SourceManager &SM, | ||
const LangOptions &LangOpts); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could keep that, and just use Lexer::findNextToken
as its implementation (it would avoid the /*IncludeComments=*/true
everywhere)
We have two copies of the same code in clang-tidy and clang-reorder-fields, and those are extremenly similar to `Lexer::findNextToken`, so just add an extra agument to the latter.
46d06ab
to
c1f4603
Compare
Thanks, and sorry I had to force-push because github would not refresh the PR (https://github.com/orgs/community/discussions/78775) |
Apply suggestions Co-authored-by: cor3ntin <[email protected]>
std::optional<Token> Tok = | ||
Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments=*/true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::optional<Token> Tok = | |
Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments=*/true); | |
std::optional<Token> Tok = | |
findNextTokenIncludingComments(Loc, SM, LangOpts); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do that, right?
while (Tok && Tok->is(tok::comment) && | ||
SM.getPresumedColumnNumber(Tok->getLocation()) > Column) { | ||
Loc = Tok->getEndLoc(); | ||
Tok = getTokenAfter(Loc, SM, LangOpts); | ||
Tok = Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments=*/true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tok = Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments=*/true); | |
Tok = findNextTokenIncludingComments(Loc, SM, LangOpts); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think clang-reorder-fields is supposed to depend on stuff from clang-tidy
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i think it's not worth it (adding a dependency on clang-tidy for the sake of findNextTokenIncludingComments seems like overkill).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed actually, thanks
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lexer::findNextToken
has an additional check:
if (!Lexer::isAtEndOfMacroExpansion(Loc, SM, LangOpts, &Loc))
Does this not change behavior in the case we are replacing? I imagine folks just copied from Lexer::findNextToken
and purposely left this out, is that wrong?
@@ -118,47 +118,19 @@ findMembersUsedInInitExpr(const CXXCtorInitializer *Initializer, | |||
return Results; | |||
} | |||
|
|||
/// Returns the next token after `Loc` (including comment tokens). | |||
static std::optional<Token> getTokenAfter(SourceLocation Loc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like just a better formatted findNextTokenIncludingComments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's why we wanted to get rid of that code, it does not really make sense to have 3 quasi-identical copies of the same code.
I think the intent was clearly to copy the whole thing, the original author states: "This part of code copy from Lexer::findNextToken except this line. " source "this line" referring to |
We have two copies of the same code in clang-tidy and clang-reorder-fields, and those are extremenly similar to
Lexer::findNextToken
, so just add an extra agument to the latter.