-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[clang-format] Stop crashing on slightly off Verilog module headers #116000
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-format Author: None (sstwcw) ChangesThis piece of code made the program crash. function pkg::t get
(int t = 2,
int f = 2); The way the code is supposed to be parsed is that UnwrappedLineParser should identify the function header, and then TokenAnnotator should recognize the result. But the code in UnwrappedLineParser would mistakenly not recognize it due to the The code in UnwrappedLineParser now recognizes the The are other cases in which TokenAnnotator would recognize the comma as both of those types, for example if the Full diff: https://github.com/llvm/llvm-project/pull/116000.diff 4 Files Affected:
diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp
index 269cbef2720792..bc5239209f3aab 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -1554,7 +1554,7 @@ class AnnotatingParser {
};
if (IsInstancePort())
- Tok->setFinalizedType(TT_VerilogInstancePortLParen);
+ Tok->setType(TT_VerilogInstancePortLParen);
}
if (!parseParens())
@@ -1730,7 +1730,7 @@ class AnnotatingParser {
Tok->setType(TT_InheritanceComma);
break;
case Context::VerilogInstancePortList:
- Tok->setFinalizedType(TT_VerilogInstancePortComma);
+ Tok->setType(TT_VerilogInstancePortComma);
break;
default:
if (Style.isVerilog() && Contexts.size() == 1 &&
diff --git a/clang/lib/Format/UnwrappedLineParser.cpp b/clang/lib/Format/UnwrappedLineParser.cpp
index 5f1dd38ef1eb3b..e3b96b19ba0318 100644
--- a/clang/lib/Format/UnwrappedLineParser.cpp
+++ b/clang/lib/Format/UnwrappedLineParser.cpp
@@ -4441,7 +4441,8 @@ unsigned UnwrappedLineParser::parseVerilogHierarchyHeader() {
Prev->setFinalizedType(TT_VerilogDimensionedTypeName);
parseSquare();
} else if (Keywords.isVerilogIdentifier(*FormatTok) ||
- FormatTok->isOneOf(Keywords.kw_automatic, tok::kw_static)) {
+ FormatTok->isOneOf(tok::coloncolon, Keywords.kw_automatic,
+ tok::kw_static)) {
nextToken();
} else {
break;
diff --git a/clang/unittests/Format/FormatTestVerilog.cpp b/clang/unittests/Format/FormatTestVerilog.cpp
index 49d276fc78d81b..2bc3887af5c1db 100644
--- a/clang/unittests/Format/FormatTestVerilog.cpp
+++ b/clang/unittests/Format/FormatTestVerilog.cpp
@@ -702,6 +702,11 @@ TEST_F(FormatTestVerilog, Hierarchy) {
" generate\n"
" endgenerate\n"
"endfunction : x");
+ // Type names with '::' should be recognized.
+ verifyFormat("function automatic x::x x\n"
+ " (input x);\n"
+ "endfunction : x");
+ verifyNoCrash("x x(x x, x x);");
}
TEST_F(FormatTestVerilog, Identifiers) {
diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp b/clang/unittests/Format/TokenAnnotatorTest.cpp
index bb8ee416ea2db5..e88b8ea4894657 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -2598,6 +2598,14 @@ TEST_F(TokenAnnotatorTest, UnderstandsVerilogOperators) {
Tokens = Annotate("x = '{\"\"};");
ASSERT_EQ(Tokens.size(), 8u) << Tokens;
EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_Unknown);
+
+ // Module headers.
+ Tokens = Annotate("module x();\nendmodule");
+ ASSERT_EQ(Tokens.size(), 7u) << Tokens;
+ EXPECT_TOKEN(Tokens[2], tok::l_paren, TT_VerilogMultiLineListLParen);
+ Tokens = Annotate("function automatic x::x x();\nendmodule");
+ ASSERT_EQ(Tokens.size(), 11u) << Tokens;
+ EXPECT_TOKEN(Tokens[6], tok::l_paren, TT_VerilogMultiLineListLParen);
}
TEST_F(TokenAnnotatorTest, UnderstandTableGenTokens) {
|
…lvm#116000) This piece of code made the program crash. ```Verilog function pkg::t get (int t = 2, int f = 2); ``` The way the code is supposed to be parsed is that UnwrappedLineParser should identify the function header, and then TokenAnnotator should recognize the result. But the code in UnwrappedLineParser would mistakenly not recognize it due to the `::`. Then TokenAnnotator would recognize the comma both as TT_VerilogInstancePortComma and TT_VerilogTypeComma. The code for annotating the instance port comma used `setFinalizedType`. The program would crash when it tried to set it to another type. The code in UnwrappedLineParser now recognizes the `::` token. The are other cases in which TokenAnnotator would recognize the comma as both of those types, for example if the `function` keyword is removed. The type is now set using `setType` instead so that the program does not crash. The developer no longer knows why he used `setFinalizedType` back then.
88547bc
to
0ff8b79
Compare
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/51/builds/6584 Here is the relevant piece of the build log for the reference
|
@@ -2598,6 +2598,20 @@ TEST_F(TokenAnnotatorTest, UnderstandsVerilogOperators) { | |||
Tokens = Annotate("x = '{\"\"};"); | |||
ASSERT_EQ(Tokens.size(), 8u) << Tokens; | |||
EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_Unknown); | |||
|
|||
// Module headers. | |||
Tokens = Annotate("module x();\nendmodule"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (and other testcases below) should be broken up into two lines:
Tokens = Annotate("module x();\n"
"endmodule");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in aa2d084.
This piece of code made the program crash.
The way the code is supposed to be parsed is that UnwrappedLineParser should identify the function header, and then TokenAnnotator should recognize the result. But the code in UnwrappedLineParser would mistakenly not recognize it due to the
::
. Then TokenAnnotator would recognize the comma both as TT_VerilogInstancePortComma and TT_VerilogTypeComma. The code for annotating the instance port comma usedsetFinalizedType
. The program would crash when it tried to set it to another type.The code in UnwrappedLineParser now recognizes the
::
token.The are other cases in which TokenAnnotator would recognize the comma as both of those types, for example if the
function
keyword is removed. The type is now set usingsetType
instead so that the program does not crash. The developer no longer knows why he usedsetFinalizedType
back then.