Skip to content

[clang-format]: Annotate colons found in inline assembly #92617

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 7, 2024

Conversation

gedare
Copy link
Contributor

@gedare gedare commented May 17, 2024

Short-circuit the parsing of tok::colon to label colons found within lines starting with asm as InlineASMColon.

Fixes #92616.

Short-circuit the parsing of tok::colon to label colons found
within lines starting with asm as InlineASMColon.

Fixes llvm#92616.
@llvmbot
Copy link
Member

llvmbot commented May 17, 2024

@llvm/pr-subscribers-clang-format

Author: Gedare Bloom (gedare)

Changes

Short-circuit the parsing of tok::colon to label colons found within lines starting with asm as InlineASMColon.

Fixes #92616.


Full diff: https://github.com/llvm/llvm-project/pull/92617.diff

2 Files Affected:

  • (modified) clang/lib/Format/TokenAnnotator.cpp (+2)
  • (modified) clang/unittests/Format/TokenAnnotatorTest.cpp (+17-3)
diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp
index 7c4c76a91f2c5..a83f937336565 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -1358,6 +1358,8 @@ class AnnotatingParser {
           Line.First->startsSequence(tok::kw_export, Keywords.kw_module) ||
           Line.First->startsSequence(tok::kw_export, Keywords.kw_import)) {
         Tok->setType(TT_ModulePartitionColon);
+      } else if (Line.First->is(tok::kw_asm)) {
+        Tok->setType(TT_InlineASMColon);
       } else if (Contexts.back().ColonIsDictLiteral || Style.isProto()) {
         Tok->setType(TT_DictLiteral);
         if (Style.Language == FormatStyle::LK_TextProto) {
diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp b/clang/unittests/Format/TokenAnnotatorTest.cpp
index aadfa6dc0165c..0f5f3936e0181 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -1489,11 +1489,25 @@ TEST_F(TokenAnnotatorTest, RequiresDoesNotChangeParsingOfTheRest) {
 TEST_F(TokenAnnotatorTest, UnderstandsAsm) {
   auto Tokens = annotate("__asm{\n"
                          "a:\n"
-                         "};");
-  ASSERT_EQ(Tokens.size(), 7u) << Tokens;
+                         ": x\n"
+                         ":};");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
   EXPECT_TOKEN(Tokens[0], tok::kw_asm, TT_Unknown);
   EXPECT_TOKEN(Tokens[1], tok::l_brace, TT_InlineASMBrace);
-  EXPECT_TOKEN(Tokens[4], tok::r_brace, TT_InlineASMBrace);
+  EXPECT_TOKEN(Tokens[3], tok::colon, TT_InlineASMColon);
+  EXPECT_TOKEN(Tokens[4], tok::colon, TT_InlineASMColon);
+  EXPECT_TOKEN(Tokens[6], tok::colon, TT_InlineASMColon);
+  EXPECT_TOKEN(Tokens[7], tok::r_brace, TT_InlineASMBrace);
+
+  Tokens = annotate("__asm__ volatile (\n"
+                    "a:\n"
+                    ": x\n"
+                    ":);");
+  ASSERT_EQ(Tokens.size(), 11u) << Tokens;
+  EXPECT_TOKEN(Tokens[0], tok::kw_asm, TT_Unknown);
+  EXPECT_TOKEN(Tokens[4], tok::colon, TT_InlineASMColon);
+  EXPECT_TOKEN(Tokens[5], tok::colon, TT_InlineASMColon);
+  EXPECT_TOKEN(Tokens[7], tok::colon, TT_InlineASMColon);
 }
 
 TEST_F(TokenAnnotatorTest, UnderstandsObjCBlock) {

@gedare
Copy link
Contributor Author

gedare commented May 17, 2024

With the patch:

$ echo " asm ( a : );" | clang-format -debug
...
 M=0 C=1 T=InlineASMColon S=1 F=0 B=0 BK=0 P=43 Name=colon L=7 PPK=2 FakeLParens= FakeRParens=1 II=0x0 Text=':'
...
$ echo " asm { a : };" | clang-format -debug
...
 M=0 C=1 T=InlineASMColon S=0 F=1 B=0 BK=0 P=43 Name=colon L=7 PPK=2 FakeLParens= FakeRParens=1 II=0x0 Text=':'

@gedare gedare requested a review from owenca May 24, 2024 17:56
Copy link
Contributor

@owenca owenca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that you used 5 different assembly snippets and repeated each 3-6 times. Instead, we can have something like the following:

asm{Snippet 1};

__asm(Snippet 2);

__asm__(Snippet 3);

asm volatile (Snippet 4);

__asm volatile (Snippet 5);

Also, no space between tok::kw_asm (i.e. asm, __asm, or __asm__) and (.

Comment on lines 1490 to +1493
auto Tokens = annotate("__asm{\n"
"a:\n"
"};");
ASSERT_EQ(Tokens.size(), 7u) << Tokens;
"\"a\":\n"
": x\n"
":};");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the unquoted a: in assembly code invalid? If not, please add your test instead of editing the existing one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK it's not valid, unless a is a CPP string.

@gedare
Copy link
Contributor Author

gedare commented May 25, 2024

It seems that you used 5 different assembly snippets and repeated each 3-6 times. Instead, we can have something like the following:

asm{Snippet 1};

__asm(Snippet 2);

__asm__(Snippet 3);

asm volatile (Snippet 4);

__asm volatile (Snippet 5);

Also, no space between tok::asm (i.e. asm, __asm, or __asm__) and (.

done, thanks for the clarification

@owenca owenca merged commit a10135f into llvm:main Jun 7, 2024
8 checks passed
@gedare gedare deleted the objc-parse-asm branch June 20, 2024 21:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[clang-format]: token annotator confuses inline asm colon as ObjCMethodExpr or DictLiteral
5 participants