Skip to content

[clang][refactor] Refactor findNextTokenIncludingComments #123060

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jan 16, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
34 changes: 3 additions & 31 deletions clang-tools-extra/clang-reorder-fields/ReorderFieldsAction.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -118,47 +118,19 @@ findMembersUsedInInitExpr(const CXXCtorInitializer *Initializer,
return Results;
}

/// Returns the next token after `Loc` (including comment tokens).
static std::optional<Token> getTokenAfter(SourceLocation Loc,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like just a better formatted findNextTokenIncludingComments

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's why we wanted to get rid of that code, it does not really make sense to have 3 quasi-identical copies of the same code.

const SourceManager &SM,
const LangOptions &LangOpts) {
if (Loc.isMacroID()) {
return std::nullopt;
}
Loc = Lexer::getLocForEndOfToken(Loc, 0, SM, LangOpts);

// Break down the source location.
std::pair<FileID, unsigned> LocInfo = SM.getDecomposedLoc(Loc);

// Try to load the file buffer.
bool InvalidTemp = false;
StringRef File = SM.getBufferData(LocInfo.first, &InvalidTemp);
if (InvalidTemp)
return std::nullopt;

const char *TokenBegin = File.data() + LocInfo.second;

Lexer lexer(SM.getLocForStartOfFile(LocInfo.first), LangOpts, File.begin(),
TokenBegin, File.end());
lexer.SetCommentRetentionState(true);
// Find the token.
Token Tok;
lexer.LexFromRawLexer(Tok);
return Tok;
}

/// Returns the end of the trailing comments after `Loc`.
static SourceLocation getEndOfTrailingComment(SourceLocation Loc,
const SourceManager &SM,
const LangOptions &LangOpts) {
// We consider any following comment token that is indented more than the
// first comment to be part of the trailing comment.
const unsigned Column = SM.getPresumedColumnNumber(Loc);
std::optional<Token> Tok = getTokenAfter(Loc, SM, LangOpts);
std::optional<Token> Tok =
Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments=*/true);
Comment on lines +128 to +129
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::optional<Token> Tok =
Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments=*/true);
std::optional<Token> Tok =
findNextTokenIncludingComments(Loc, SM, LangOpts);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that, right?

while (Tok && Tok->is(tok::comment) &&
SM.getPresumedColumnNumber(Tok->getLocation()) > Column) {
Loc = Tok->getEndLoc();
Tok = getTokenAfter(Loc, SM, LangOpts);
Tok = Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments=*/true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Tok = Lexer::findNextToken(Loc, SM, LangOpts, /*IncludeComments=*/true);
Tok = findNextTokenIncludingComments(Loc, SM, LangOpts);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think clang-reorder-fields is supposed to depend on stuff from clang-tidy ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, i think it's not worth it (adding a dependency on clang-tidy for the sake of findNextTokenIncludingComments seems like overkill).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed actually, thanks

}
return Loc;
}
Expand Down
23 changes: 0 additions & 23 deletions clang-tools-extra/clang-tidy/utils/LexerUtils.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -86,29 +86,6 @@ SourceLocation findNextTerminator(SourceLocation Start, const SourceManager &SM,
return findNextAnyTokenKind(Start, SM, LangOpts, tok::comma, tok::semi);
}

std::optional<Token>
findNextTokenIncludingComments(SourceLocation Start, const SourceManager &SM,
const LangOptions &LangOpts) {
// `Lexer::findNextToken` will ignore comment
if (Start.isMacroID())
return std::nullopt;
Start = Lexer::getLocForEndOfToken(Start, 0, SM, LangOpts);
// Break down the source location.
std::pair<FileID, unsigned> LocInfo = SM.getDecomposedLoc(Start);
bool InvalidTemp = false;
StringRef File = SM.getBufferData(LocInfo.first, &InvalidTemp);
if (InvalidTemp)
return std::nullopt;
// Lex from the start of the given location.
Lexer L(SM.getLocForStartOfFile(LocInfo.first), LangOpts, File.begin(),
File.data() + LocInfo.second, File.end());
L.SetCommentRetentionState(true);
// Find the token.
Token Tok;
L.LexFromRawLexer(Tok);
return Tok;
}

std::optional<Token>
findNextTokenSkippingComments(SourceLocation Start, const SourceManager &SM,
const LangOptions &LangOpts) {
Expand Down
6 changes: 4 additions & 2 deletions clang-tools-extra/clang-tidy/utils/LexerUtils.h
Original file line number Diff line number Diff line change
Expand Up @@ -89,9 +89,11 @@ SourceLocation findNextAnyTokenKind(SourceLocation Start,
}
}

std::optional<Token>
inline std::optional<Token>
findNextTokenIncludingComments(SourceLocation Start, const SourceManager &SM,
const LangOptions &LangOpts);
const LangOptions &LangOpts) {
return Lexer::findNextToken(Start, SM, LangOpts, true);
}

// Finds next token that's not a comment.
std::optional<Token> findNextTokenSkippingComments(SourceLocation Start,
Expand Down
3 changes: 2 additions & 1 deletion clang/include/clang/Lex/Lexer.h
Original file line number Diff line number Diff line change
Expand Up @@ -554,7 +554,8 @@ class Lexer : public PreprocessorLexer {
/// Returns the next token, or std::nullopt if the location is inside a macro.
static std::optional<Token> findNextToken(SourceLocation Loc,
const SourceManager &SM,
const LangOptions &LangOpts);
const LangOptions &LangOpts,
bool IncludeComments = false);

/// Checks that the given token is the first token that occurs after
/// the given location (this excludes comments and whitespace). Returns the
Expand Down
4 changes: 3 additions & 1 deletion clang/lib/Lex/Lexer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1323,7 +1323,8 @@ const char *Lexer::SkipEscapedNewLines(const char *P) {

std::optional<Token> Lexer::findNextToken(SourceLocation Loc,
const SourceManager &SM,
const LangOptions &LangOpts) {
const LangOptions &LangOpts,
bool IncludeComments) {
if (Loc.isMacroID()) {
if (!Lexer::isAtEndOfMacroExpansion(Loc, SM, LangOpts, &Loc))
return std::nullopt;
Expand All @@ -1344,6 +1345,7 @@ std::optional<Token> Lexer::findNextToken(SourceLocation Loc,
// Lex from the start of the given location.
Lexer lexer(SM.getLocForStartOfFile(LocInfo.first), LangOpts, File.begin(),
TokenBegin, File.end());
lexer.SetCommentRetentionState(IncludeComments);
// Find the token.
Token Tok;
lexer.LexFromRawLexer(Tok);
Expand Down
21 changes: 21 additions & 0 deletions clang/unittests/Lex/LexerTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -603,6 +603,7 @@ TEST_F(LexerTest, CharRangeOffByOne) {

TEST_F(LexerTest, FindNextToken) {
Lex("int abcd = 0;\n"
"// A comment.\n"
"int xyz = abcd;\n");
std::vector<std::string> GeneratedByNextToken;
SourceLocation Loc =
Expand All @@ -619,6 +620,26 @@ TEST_F(LexerTest, FindNextToken) {
"xyz", "=", "abcd", ";"));
}

TEST_F(LexerTest, FindNextTokenIncludingComments) {
Lex("int abcd = 0;\n"
"// A comment.\n"
"int xyz = abcd;\n");
std::vector<std::string> GeneratedByNextToken;
SourceLocation Loc =
SourceMgr.getLocForStartOfFile(SourceMgr.getMainFileID());
while (true) {
auto T = Lexer::findNextToken(Loc, SourceMgr, LangOpts, true);
ASSERT_TRUE(T);
if (T->is(tok::eof))
break;
GeneratedByNextToken.push_back(getSourceText(*T, *T));
Loc = T->getLocation();
}
EXPECT_THAT(GeneratedByNextToken,
ElementsAre("abcd", "=", "0", ";", "// A comment.", "int", "xyz",
"=", "abcd", ";"));
}

TEST_F(LexerTest, CreatedFIDCountForPredefinedBuffer) {
TrivialModuleLoader ModLoader;
auto PP = CreatePP("", ModLoader);
Expand Down
Loading