Skip to content

[clang-format] Add TemplateNames option to help parse C++ angles #109916

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 3, 2024

Conversation

owenca
Copy link
Contributor

@owenca owenca commented Sep 25, 2024

Closes #109912.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang-format labels Sep 25, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 25, 2024

@llvm/pr-subscribers-clang-format

Author: Owen Pan (owenca)

Changes

Closes #109912.


Full diff: https://github.com/llvm/llvm-project/pull/109916.diff

9 Files Affected:

  • (modified) clang/docs/ClangFormatStyleOptions.rst (+9)
  • (modified) clang/docs/ReleaseNotes.rst (+1)
  • (modified) clang/include/clang/Format/Format.h (+10)
  • (modified) clang/lib/Format/Format.cpp (+1)
  • (modified) clang/lib/Format/FormatToken.h (+1)
  • (modified) clang/lib/Format/FormatTokenLexer.cpp (+4)
  • (modified) clang/lib/Format/FormatTokenLexer.h (+1-1)
  • (modified) clang/lib/Format/TokenAnnotator.cpp (+20-16)
  • (modified) clang/unittests/Format/TokenAnnotatorTest.cpp (+17)
diff --git a/clang/docs/ClangFormatStyleOptions.rst b/clang/docs/ClangFormatStyleOptions.rst
index a427d7cd40fcdd..a16edb0989b05c 100644
--- a/clang/docs/ClangFormatStyleOptions.rst
+++ b/clang/docs/ClangFormatStyleOptions.rst
@@ -6554,6 +6554,15 @@ the configuration (without a prefix: ``Auto``).
     let DAGArgOtherID = (other i32:$other1, i32:$other2);
     let DAGArgBang = (!cast<SomeType>("Some") i32:$src1, i32:$src2)
 
+.. _TemplateNames:
+
+**TemplateNames** (``List of Strings``) :versionbadge:`clang-format 20` :ref:`¶ <TemplateNames>`
+  A vector of non-keyword identifiers that should be interpreted as
+  template names.
+
+  A ``<`` after a template name is annotated as a template opener instead of
+  a binary operator.
+
 .. _TypeNames:
 
 **TypeNames** (``List of Strings``) :versionbadge:`clang-format 17` :ref:`¶ <TypeNames>`
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index f4535db7356194..b86a40b7cc9407 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -532,6 +532,7 @@ clang-format
 ------------
 
 - Adds ``BreakBinaryOperations`` option.
+- Adds ``TemplateNames`` option.
 
 libclang
 --------
diff --git a/clang/include/clang/Format/Format.h b/clang/include/clang/Format/Format.h
index d8b62c7652a0f6..53a9577e0f72e7 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -4974,6 +4974,15 @@ struct FormatStyle {
   /// \version 3.7
   unsigned TabWidth;
 
+  /// A vector of non-keyword identifiers that should be interpreted as
+  /// template names.
+  ///
+  /// A ``<`` after a template name is annotated as a template opener instead of
+  /// a binary operator.
+  ///
+  /// \version 20
+  std::vector<std::string> TemplateNames;
+
   /// A vector of non-keyword identifiers that should be interpreted as type
   /// names.
   ///
@@ -5230,6 +5239,7 @@ struct FormatStyle {
            TableGenBreakingDAGArgOperators ==
                R.TableGenBreakingDAGArgOperators &&
            TableGenBreakInsideDAGArg == R.TableGenBreakInsideDAGArg &&
+           TabWidth == R.TabWidth && TemplateNames == R.TemplateNames &&
            TabWidth == R.TabWidth && TypeNames == R.TypeNames &&
            TypenameMacros == R.TypenameMacros && UseTab == R.UseTab &&
            VerilogBreakBetweenInstancePorts ==
diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index d2463b892fbb96..5350c66ea5132b 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -1147,6 +1147,7 @@ template <> struct MappingTraits<FormatStyle> {
     IO.mapOptional("TableGenBreakInsideDAGArg",
                    Style.TableGenBreakInsideDAGArg);
     IO.mapOptional("TabWidth", Style.TabWidth);
+    IO.mapOptional("TemplateNames", Style.TemplateNames);
     IO.mapOptional("TypeNames", Style.TypeNames);
     IO.mapOptional("TypenameMacros", Style.TypenameMacros);
     IO.mapOptional("UseTab", Style.UseTab);
diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index 03c0cbd60961a2..7d342a7dcca01d 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -179,6 +179,7 @@ namespace format {
   TYPE(TrailingReturnArrow)                                                    \
   TYPE(TrailingUnaryOperator)                                                  \
   TYPE(TypeDeclarationParen)                                                   \
+  TYPE(TemplateName)                                                           \
   TYPE(TypeName)                                                               \
   TYPE(TypenameMacro)                                                          \
   TYPE(UnaryOperator)                                                          \
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index 63949b2e26bdc1..2cdf6cd286b280 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -72,6 +72,8 @@ FormatTokenLexer::FormatTokenLexer(
     Macros.insert({Identifier, TT_StatementAttributeLikeMacro});
   }
 
+  for (const auto &TemplateName : Style.TemplateNames)
+    TemplateNames.insert(&IdentTable.get(TemplateName));
   for (const auto &TypeName : Style.TypeNames)
     TypeNames.insert(&IdentTable.get(TypeName));
 }
@@ -1368,6 +1370,8 @@ FormatToken *FormatTokenLexer::getNextToken() {
         FormatTok->setType(TT_MacroBlockBegin);
       else if (MacroBlockEndRegex.match(Text))
         FormatTok->setType(TT_MacroBlockEnd);
+      else if (TemplateNames.contains(Identifier))
+        FormatTok->setFinalizedType(TT_TemplateName);
       else if (TypeNames.contains(Identifier))
         FormatTok->setFinalizedType(TT_TypeName);
     }
diff --git a/clang/lib/Format/FormatTokenLexer.h b/clang/lib/Format/FormatTokenLexer.h
index 277cc0a2dfde66..71389d2ade2b73 100644
--- a/clang/lib/Format/FormatTokenLexer.h
+++ b/clang/lib/Format/FormatTokenLexer.h
@@ -129,7 +129,7 @@ class FormatTokenLexer {
 
   llvm::SmallMapVector<IdentifierInfo *, TokenType, 8> Macros;
 
-  llvm::SmallPtrSet<IdentifierInfo *, 8> TypeNames;
+  llvm::SmallPtrSet<IdentifierInfo *, 8> TemplateNames, TypeNames;
 
   bool FormattingDisabled;
 
diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp
index f665ce2ad81eb0..0d06b4ee5ea6be 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -149,36 +149,36 @@ class AnnotatingParser {
   }
 
   bool parseAngle() {
-    if (!CurrentToken || !CurrentToken->Previous)
+    if (!CurrentToken)
+      return false;
+
+    auto *Left = CurrentToken->Previous; // The '<'.
+    if (!Left)
       return false;
-    if (NonTemplateLess.count(CurrentToken->Previous) > 0)
+
+    if (NonTemplateLess.count(Left) > 0)
       return false;
 
-    if (const auto &Previous = *CurrentToken->Previous; // The '<'.
-        Previous.Previous) {
-      if (Previous.Previous->Tok.isLiteral())
+    const auto *BeforeLess = Left->Previous;
+
+    if (BeforeLess) {
+      if (BeforeLess->Tok.isLiteral())
         return false;
-      if (Previous.Previous->is(tok::r_brace))
+      if (BeforeLess->is(tok::r_brace))
         return false;
-      if (Previous.Previous->is(tok::r_paren) && Contexts.size() > 1 &&
-          (!Previous.Previous->MatchingParen ||
-           Previous.Previous->MatchingParen->isNot(
-               TT_OverloadedOperatorLParen))) {
+      if (BeforeLess->is(tok::r_paren) && Contexts.size() > 1 &&
+          !(BeforeLess->MatchingParen &&
+            BeforeLess->MatchingParen->is(TT_OverloadedOperatorLParen))) {
         return false;
       }
-      if (Previous.Previous->is(tok::kw_operator) &&
-          CurrentToken->is(tok::l_paren)) {
+      if (BeforeLess->is(tok::kw_operator) && CurrentToken->is(tok::l_paren))
         return false;
-      }
     }
 
-    FormatToken *Left = CurrentToken->Previous;
     Left->ParentBracket = Contexts.back().ContextKind;
     ScopedContextCreator ContextCreator(*this, tok::less, 12);
     Contexts.back().IsExpression = false;
 
-    const auto *BeforeLess = Left->Previous;
-
     // If there's a template keyword before the opening angle bracket, this is a
     // template parameter, not an argument.
     if (BeforeLess && BeforeLess->isNot(tok::kw_template))
@@ -229,6 +229,10 @@ class AnnotatingParser {
         next();
         return true;
       }
+      if (BeforeLess && BeforeLess->is(TT_TemplateName)) {
+        next();
+        continue;
+      }
       if (CurrentToken->is(tok::question) &&
           Style.Language == FormatStyle::LK_Java) {
         next();
diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp b/clang/unittests/Format/TokenAnnotatorTest.cpp
index 1884d41a5f23f5..ac072822d2b6ef 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -3505,6 +3505,23 @@ TEST_F(TokenAnnotatorTest, SplitPenalty) {
   EXPECT_SPLIT_PENALTY(Tokens[7], 23u);
 }
 
+TEST_F(TokenAnnotatorTest, TemplateName) {
+  constexpr StringRef Code{"return Foo < A || B > (C ^ D);"};
+
+  auto Tokens = annotate(Code);
+  ASSERT_EQ(Tokens.size(), 14u) << Tokens;
+  EXPECT_TOKEN(Tokens[2], tok::less, TT_BinaryOperator);
+  EXPECT_TOKEN(Tokens[6], tok::greater, TT_BinaryOperator);
+
+  auto Style = getLLVMStyle();
+  Style.TemplateNames.push_back("Foo");
+
+  Tokens = annotate(Code, Style);
+  EXPECT_TOKEN(Tokens[1], tok::identifier, TT_TemplateName);
+  EXPECT_TOKEN(Tokens[2], tok::less, TT_TemplateOpener);
+  EXPECT_TOKEN(Tokens[6], tok::greater, TT_TemplateCloser);
+}
+
 } // namespace
 } // namespace format
 } // namespace clang

@llvmbot
Copy link
Member

llvmbot commented Sep 25, 2024

@llvm/pr-subscribers-clang

Author: Owen Pan (owenca)

Changes

Closes #109912.


Full diff: https://github.com/llvm/llvm-project/pull/109916.diff

9 Files Affected:

  • (modified) clang/docs/ClangFormatStyleOptions.rst (+9)
  • (modified) clang/docs/ReleaseNotes.rst (+1)
  • (modified) clang/include/clang/Format/Format.h (+10)
  • (modified) clang/lib/Format/Format.cpp (+1)
  • (modified) clang/lib/Format/FormatToken.h (+1)
  • (modified) clang/lib/Format/FormatTokenLexer.cpp (+4)
  • (modified) clang/lib/Format/FormatTokenLexer.h (+1-1)
  • (modified) clang/lib/Format/TokenAnnotator.cpp (+20-16)
  • (modified) clang/unittests/Format/TokenAnnotatorTest.cpp (+17)
diff --git a/clang/docs/ClangFormatStyleOptions.rst b/clang/docs/ClangFormatStyleOptions.rst
index a427d7cd40fcdd..a16edb0989b05c 100644
--- a/clang/docs/ClangFormatStyleOptions.rst
+++ b/clang/docs/ClangFormatStyleOptions.rst
@@ -6554,6 +6554,15 @@ the configuration (without a prefix: ``Auto``).
     let DAGArgOtherID = (other i32:$other1, i32:$other2);
     let DAGArgBang = (!cast<SomeType>("Some") i32:$src1, i32:$src2)
 
+.. _TemplateNames:
+
+**TemplateNames** (``List of Strings``) :versionbadge:`clang-format 20` :ref:`¶ <TemplateNames>`
+  A vector of non-keyword identifiers that should be interpreted as
+  template names.
+
+  A ``<`` after a template name is annotated as a template opener instead of
+  a binary operator.
+
 .. _TypeNames:
 
 **TypeNames** (``List of Strings``) :versionbadge:`clang-format 17` :ref:`¶ <TypeNames>`
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index f4535db7356194..b86a40b7cc9407 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -532,6 +532,7 @@ clang-format
 ------------
 
 - Adds ``BreakBinaryOperations`` option.
+- Adds ``TemplateNames`` option.
 
 libclang
 --------
diff --git a/clang/include/clang/Format/Format.h b/clang/include/clang/Format/Format.h
index d8b62c7652a0f6..53a9577e0f72e7 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -4974,6 +4974,15 @@ struct FormatStyle {
   /// \version 3.7
   unsigned TabWidth;
 
+  /// A vector of non-keyword identifiers that should be interpreted as
+  /// template names.
+  ///
+  /// A ``<`` after a template name is annotated as a template opener instead of
+  /// a binary operator.
+  ///
+  /// \version 20
+  std::vector<std::string> TemplateNames;
+
   /// A vector of non-keyword identifiers that should be interpreted as type
   /// names.
   ///
@@ -5230,6 +5239,7 @@ struct FormatStyle {
            TableGenBreakingDAGArgOperators ==
                R.TableGenBreakingDAGArgOperators &&
            TableGenBreakInsideDAGArg == R.TableGenBreakInsideDAGArg &&
+           TabWidth == R.TabWidth && TemplateNames == R.TemplateNames &&
            TabWidth == R.TabWidth && TypeNames == R.TypeNames &&
            TypenameMacros == R.TypenameMacros && UseTab == R.UseTab &&
            VerilogBreakBetweenInstancePorts ==
diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index d2463b892fbb96..5350c66ea5132b 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -1147,6 +1147,7 @@ template <> struct MappingTraits<FormatStyle> {
     IO.mapOptional("TableGenBreakInsideDAGArg",
                    Style.TableGenBreakInsideDAGArg);
     IO.mapOptional("TabWidth", Style.TabWidth);
+    IO.mapOptional("TemplateNames", Style.TemplateNames);
     IO.mapOptional("TypeNames", Style.TypeNames);
     IO.mapOptional("TypenameMacros", Style.TypenameMacros);
     IO.mapOptional("UseTab", Style.UseTab);
diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index 03c0cbd60961a2..7d342a7dcca01d 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -179,6 +179,7 @@ namespace format {
   TYPE(TrailingReturnArrow)                                                    \
   TYPE(TrailingUnaryOperator)                                                  \
   TYPE(TypeDeclarationParen)                                                   \
+  TYPE(TemplateName)                                                           \
   TYPE(TypeName)                                                               \
   TYPE(TypenameMacro)                                                          \
   TYPE(UnaryOperator)                                                          \
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index 63949b2e26bdc1..2cdf6cd286b280 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -72,6 +72,8 @@ FormatTokenLexer::FormatTokenLexer(
     Macros.insert({Identifier, TT_StatementAttributeLikeMacro});
   }
 
+  for (const auto &TemplateName : Style.TemplateNames)
+    TemplateNames.insert(&IdentTable.get(TemplateName));
   for (const auto &TypeName : Style.TypeNames)
     TypeNames.insert(&IdentTable.get(TypeName));
 }
@@ -1368,6 +1370,8 @@ FormatToken *FormatTokenLexer::getNextToken() {
         FormatTok->setType(TT_MacroBlockBegin);
       else if (MacroBlockEndRegex.match(Text))
         FormatTok->setType(TT_MacroBlockEnd);
+      else if (TemplateNames.contains(Identifier))
+        FormatTok->setFinalizedType(TT_TemplateName);
       else if (TypeNames.contains(Identifier))
         FormatTok->setFinalizedType(TT_TypeName);
     }
diff --git a/clang/lib/Format/FormatTokenLexer.h b/clang/lib/Format/FormatTokenLexer.h
index 277cc0a2dfde66..71389d2ade2b73 100644
--- a/clang/lib/Format/FormatTokenLexer.h
+++ b/clang/lib/Format/FormatTokenLexer.h
@@ -129,7 +129,7 @@ class FormatTokenLexer {
 
   llvm::SmallMapVector<IdentifierInfo *, TokenType, 8> Macros;
 
-  llvm::SmallPtrSet<IdentifierInfo *, 8> TypeNames;
+  llvm::SmallPtrSet<IdentifierInfo *, 8> TemplateNames, TypeNames;
 
   bool FormattingDisabled;
 
diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp
index f665ce2ad81eb0..0d06b4ee5ea6be 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -149,36 +149,36 @@ class AnnotatingParser {
   }
 
   bool parseAngle() {
-    if (!CurrentToken || !CurrentToken->Previous)
+    if (!CurrentToken)
+      return false;
+
+    auto *Left = CurrentToken->Previous; // The '<'.
+    if (!Left)
       return false;
-    if (NonTemplateLess.count(CurrentToken->Previous) > 0)
+
+    if (NonTemplateLess.count(Left) > 0)
       return false;
 
-    if (const auto &Previous = *CurrentToken->Previous; // The '<'.
-        Previous.Previous) {
-      if (Previous.Previous->Tok.isLiteral())
+    const auto *BeforeLess = Left->Previous;
+
+    if (BeforeLess) {
+      if (BeforeLess->Tok.isLiteral())
         return false;
-      if (Previous.Previous->is(tok::r_brace))
+      if (BeforeLess->is(tok::r_brace))
         return false;
-      if (Previous.Previous->is(tok::r_paren) && Contexts.size() > 1 &&
-          (!Previous.Previous->MatchingParen ||
-           Previous.Previous->MatchingParen->isNot(
-               TT_OverloadedOperatorLParen))) {
+      if (BeforeLess->is(tok::r_paren) && Contexts.size() > 1 &&
+          !(BeforeLess->MatchingParen &&
+            BeforeLess->MatchingParen->is(TT_OverloadedOperatorLParen))) {
         return false;
       }
-      if (Previous.Previous->is(tok::kw_operator) &&
-          CurrentToken->is(tok::l_paren)) {
+      if (BeforeLess->is(tok::kw_operator) && CurrentToken->is(tok::l_paren))
         return false;
-      }
     }
 
-    FormatToken *Left = CurrentToken->Previous;
     Left->ParentBracket = Contexts.back().ContextKind;
     ScopedContextCreator ContextCreator(*this, tok::less, 12);
     Contexts.back().IsExpression = false;
 
-    const auto *BeforeLess = Left->Previous;
-
     // If there's a template keyword before the opening angle bracket, this is a
     // template parameter, not an argument.
     if (BeforeLess && BeforeLess->isNot(tok::kw_template))
@@ -229,6 +229,10 @@ class AnnotatingParser {
         next();
         return true;
       }
+      if (BeforeLess && BeforeLess->is(TT_TemplateName)) {
+        next();
+        continue;
+      }
       if (CurrentToken->is(tok::question) &&
           Style.Language == FormatStyle::LK_Java) {
         next();
diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp b/clang/unittests/Format/TokenAnnotatorTest.cpp
index 1884d41a5f23f5..ac072822d2b6ef 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -3505,6 +3505,23 @@ TEST_F(TokenAnnotatorTest, SplitPenalty) {
   EXPECT_SPLIT_PENALTY(Tokens[7], 23u);
 }
 
+TEST_F(TokenAnnotatorTest, TemplateName) {
+  constexpr StringRef Code{"return Foo < A || B > (C ^ D);"};
+
+  auto Tokens = annotate(Code);
+  ASSERT_EQ(Tokens.size(), 14u) << Tokens;
+  EXPECT_TOKEN(Tokens[2], tok::less, TT_BinaryOperator);
+  EXPECT_TOKEN(Tokens[6], tok::greater, TT_BinaryOperator);
+
+  auto Style = getLLVMStyle();
+  Style.TemplateNames.push_back("Foo");
+
+  Tokens = annotate(Code, Style);
+  EXPECT_TOKEN(Tokens[1], tok::identifier, TT_TemplateName);
+  EXPECT_TOKEN(Tokens[2], tok::less, TT_TemplateOpener);
+  EXPECT_TOKEN(Tokens[6], tok::greater, TT_TemplateCloser);
+}
+
 } // namespace
 } // namespace format
 } // namespace clang

Copy link
Contributor

@mydeveloperday mydeveloperday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this as a "Get our of jail free card", we've seen enough ambiguous cases for this to need to be a thing for those who face challenges

@owenca owenca merged commit 688bc95 into llvm:main Oct 3, 2024
12 checks passed
@owenca owenca deleted the TemplateNames branch October 3, 2024 01:10
@owenca owenca removed the clang Clang issues not falling into any other category label Jan 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add TemplateNames option to clang-format
4 participants