[Format] Fix isStartOfName to recognize attributes #76804

Merged
merged 5 commits into from
Jan 15, 2024

Conversation

ilya-biryukov (Contributor) commented Jan 3, 2024

This addresses a problem with formatting attributes. Some context:

  • eaff083 changed isStartOfName to fix problems inside #pragmas, but this behavior changed formatting of attribute macros in an undesirable way.
  • efeb546 changed Google format style to fix some widely used attributes.

Instead of changing the format style, this commit specializes behavior introduced in eaff083 to #pragmas. This seems to work well in both cases.

Also update the test with two GUARDED_BY directives. While the formatting after efeb546 seems better, this case is rare enough to not warrant the extra complexity. We are reverting it back to the state it had before efeb546.

llvmbot (Member) commented Jan 3, 2024

@llvm/pr-subscribers-clang-format

Author: Ilya Biryukov (ilya-biryukov)

Changes

This addresses a problem with formatting attributes. Some context:

  • 199fc97 changed isStartOfName to fix problems inside macro directives (judging by the added tests), but this behavior changed formatting of attribute macros in an undesirable way.
  • efeb546 changed Google format style to fix some widely used attributes.

Instead of changing the format style, this commit specializes behavior introduced in 199fc97 to macro directives. This seems to work well in both cases.

Also update the test with two GUARDED_BY directives. While the formatting after efeb546 seems better, this case is rare enough to not warrant the extra complexity. We are reverting it back to the state it had before efeb546.

Full diff: https://github.com/llvm/llvm-project/pull/76804.diff

2 Files Affected:

  • (modified) clang/lib/Format/Format.cpp (-2)
  • (modified) clang/lib/Format/TokenAnnotator.cpp (+2-1)
diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index f798d555bf9929..38974f578fe1d2 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -1698,8 +1698,6 @@ FormatStyle getGoogleStyle(FormatStyle::LanguageKind Language) {
           /*BasedOnStyle=*/"google",
       },
   };
-  GoogleStyle.AttributeMacros.push_back("GUARDED_BY");
-  GoogleStyle.AttributeMacros.push_back("ABSL_GUARDED_BY");
 
   GoogleStyle.SpacesBeforeTrailingComments = 2;
   GoogleStyle.Standard = FormatStyle::LS_Auto;
diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp
index 3ac3aa3c5e3a22..94fe5b21cfc6e6 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -2209,7 +2209,8 @@ class AnnotatingParser {
         (!NextNonComment && !Line.InMacroBody) ||
         (NextNonComment &&
          (NextNonComment->isPointerOrReference() ||
-          NextNonComment->isOneOf(tok::identifier, tok::string_literal)))) {
+          (Line.InPragmaDirective &&
+           NextNonComment->isOneOf(tok::identifier, tok::string_literal))))) {
       return false;
     }
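
To make the effect of the TokenAnnotator.cpp hunk easier to see, here is a minimal, self-contained sketch of just the sub-condition the diff touches. It is not the real clang-format code: TokKind, LineState, bailsOutBefore and bailsOutAfter are hypothetical stand-ins, and the other clauses of the real condition (including the !NextNonComment case) are left out. A result of true corresponds to reaching the `return false` in the diff, i.e. the current token is not treated as a start of name.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for clang-format's token kinds and
// per-line state; the real types live in clang/lib/Format.
enum class TokKind { Identifier, StringLiteral, PointerOrReference, Other };

struct LineState {
  bool InPragmaDirective = false;
};

// Models the sub-condition before the patch: any following identifier or
// string literal makes the check bail out.
bool bailsOutBefore(TokKind Next) {
  return Next == TokKind::PointerOrReference ||
         Next == TokKind::Identifier || Next == TokKind::StringLiteral;
}

// Models the sub-condition after the patch: a following identifier or string
// literal only makes the check bail out inside a #pragma line.
bool bailsOutAfter(const LineState &Line, TokKind Next) {
  return Next == TokKind::PointerOrReference ||
         (Line.InPragmaDirective &&
          (Next == TokKind::Identifier || Next == TokKind::StringLiteral));
}

// Self-check for the `int foo BAR` shape, where `foo` is followed by an
// identifier on an ordinary (non-pragma) line.
void checkModel() {
  LineState Plain;
  LineState Pragma;
  Pragma.InPragmaDirective = true;

  assert(bailsOutBefore(TokKind::Identifier));          // old: foo rejected
  assert(!bailsOutAfter(Plain, TokKind::Identifier));   // new: foo may be a name
  assert(bailsOutAfter(Pragma, TokKind::Identifier));   // pragma case unchanged
  assert(bailsOutAfter(Plain, TokKind::PointerOrReference));
}
```

Under these assumptions, the only behavioral difference is on non-pragma lines, which matches the stated intent of restricting the earlier change to #pragmas.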
 

kadircet (Member) left a comment

can you also add a test to clang/unittests/Format/TokenAnnotatorTest.cpp that ensures trailing attribute-like macros receive StartOfName annotation to make sure we don't regress the signal in the future?

ilya-biryukov (Contributor, Author) commented

> can you also add a test to clang/unittests/Format/TokenAnnotatorTest.cpp that ensures trailing attribute-like macros receive StartOfName annotation to make sure we don't regress the signal in the future?

ok, that opened a whole can of worms.

 Tokens = annotate("void foo GUARDED_BY(x)");

gets annotated as

{(void, "void" , Unknown),
  (identifier, "foo" , StartOfName),
  (identifier, "GUARDED_BY" , FunctionDeclarationName),
  (l_paren, "(" , Unknown),
  (identifier, "x" , Unknown),
  (r_paren, ")" , Unknown),
  (eof, "" , Unknown)}

I expected to get some heuristics for attributes, but instead GUARDED_BY gets annotated as a function declaration name.
It feels that the current behavior is a result of two mistakes cancelling each other out. I don't think adding a unit test like this is warranted, even if formatting behavior is actually correct.

@owenca what are your thoughts on this change and whether we should add a test here?

@@ -1698,8 +1698,6 @@ FormatStyle getGoogleStyle(FormatStyle::LanguageKind Language) {
           /*BasedOnStyle=*/"google",
       },
   };
-  GoogleStyle.AttributeMacros.push_back("GUARDED_BY");
Contributor

Even if that would not be needed anymore to achieve the expected formatting, these are AttributeMacros, and should be declared as such.

Contributor Author

They are attribute macros indeed, the problem is that we actually need more. The ones used with fields are at least:

  • ABSL_PT_GUARDED_BY
  • ABSL_ACQUIRED_AFTER
  • ABSL_ACQUIRED_BEFORE
  • ABSL_GUARDED_BY_FIXME

We could also consider including the annotations for functions, but the patch only broke formatting for variables, so it's not strictly necessary to unblock the release.

If we want to also include the ones that are used with functions (they are not strictly necessary because clang-format does a decent job there without config), we would need to add at least these:

  • ABSL_EXCLUSIVE_LOCKS_REQUIRED
  • ABSL_LOCKS_EXCLUDED
  • ABSL_LOCK_RETURNED
  • ABSL_EXCLUSIVE_LOCK_FUNCTION
  • ABSL_EXCLUSIVE_TRYLOCK_FUNCTION
  • ABSL_SHARED_TRYLOCK_FUNCTION
  • ABSL_ASSERT_EXCLUSIVE_LOCK
  • ABSL_ASSERT_SHARED_LOCK
  • ABSL_NO_THREAD_SAFETY_ANALYSIS

I am not sure how to best approach it and would appreciate some guidance here. Should we have all these attribute macros inside AttributeMacros or should we aim for clang-format formatting them reasonably without configuration?
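
For readers unfamiliar with these macros, the following sketch shows the two shapes being discussed: an annotation trailing a field (the `int foo BAR(x)` form) and one trailing a function declarator. The macros are defined here as hypothetical no-op stand-ins so the snippet compiles without Abseil; the Counter class and its member names are invented for illustration only.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical no-op stand-ins so this sketch builds without Abseil.
// With the real library these expand to thread-safety attributes.
#ifndef ABSL_GUARDED_BY
#define ABSL_GUARDED_BY(x)
#endif
#ifndef ABSL_EXCLUSIVE_LOCKS_REQUIRED
#define ABSL_EXCLUSIVE_LOCKS_REQUIRED(...)
#endif

class Counter {
public:
  // Takes the lock, bumps the counter, and returns the new value.
  int incrementAndGet() {
    std::lock_guard<std::mutex> Lock(Mu);
    bumpLocked();
    return Value;
  }

private:
  // Function annotation: the macro trails the declarator.
  void bumpLocked() ABSL_EXCLUSIVE_LOCKS_REQUIRED(Mu) { ++Value; }

  std::mutex Mu;
  // Field annotation: the macro sits between the name and the initializer,
  // which is the case whose formatting regressed.
  int Value ABSL_GUARDED_BY(Mu) = 0;
};
```

In the field case the formatter has to decide that Value, not the macro, is the declared name, which is the decision isStartOfName feeds into.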

Contributor Author

@HazardyKnusperkeks friendly ping. Any thoughts on including a few more attributes in AttributeMacros (the first list of 4 elements) vs landing this change and relying on implicit formatting of those as function names?

I am happy to choose one of the two options arbitrarily, but I don't have enough context on clang-format to understand which approach is preferable, so I would love to get an opinion from someone in the clang-format community.

Member

FWIW, not recognizing foo in int foo BAR as start-of-name looks like a big enough regression (and seems to be the main reason behind the line-breaking behavior change). Independent of whatever we do with the list of attribute macros, I believe we should still make sure annotations for foo are correct rather urgently. So I am actually still in favor of landing this patch as-is, rather than trying to fix final formatting in a bunch of special cases by updating the AttributeMacros list.

Contributor

I'm open in all directions.

When clang-format does format attribute macros out of the box correctly, that is nice. But I wouldn't put (too much) work into it, if declaring them to clang-format as what they are fixes all misformatting.

Thus I would keep the entries in AttributeMacros.

Contributor Author

I think we should postpone the inclusion of those names into AttributeMacros and land the patch as is.

It seems that everyone agrees that having this formatting without explicit AttributeMacros is desirable, so landing as is is a no-brainer.

Whether we should add common macro names into AttributeMacros is more contentious, so I think we may need more data to back up our decision to go either way. I have some examples where not having those names in the config leads to undesirable formatting, but I would share them in a follow-up conversation.

owenca (Contributor) commented Jan 9, 2024

>> can you also add a test to clang/unittests/Format/TokenAnnotatorTest.cpp that ensures trailing attribute-like macros receive StartOfName annotation to make sure we don't regress the signal in the future?
>
> ok, that opened a whole can of worms.
>
>  Tokens = annotate("void foo GUARDED_BY(x)");
>
> gets annotated as
>
> {(void, "void" , Unknown),
>   (identifier, "foo" , StartOfName),
>   (identifier, "GUARDED_BY" , FunctionDeclarationName),
>   (l_paren, "(" , Unknown),
>   (identifier, "x" , Unknown),
>   (r_paren, ")" , Unknown),
>   (eof, "" , Unknown)}
>
> I expected to get some heuristics for attributes, but instead GUARDED_BY gets annotated as a function declaration name. It feels that the current behavior is a result of two mistakes cancelling each other out. I don't think adding a unit test like this is warranted, even if formatting behavior is actually correct.
>
> @owenca what are your thoughts on this change and whether we should add a test here?

We usually add a FIXME test wrapped in a #if 0 block:

// FIXME: ...
#if 0
Tokens = annotate("void foo GUARDED_BY(x);");
...
Tokens = annotate("void foo GUARDED_BY(x) {}");
...
#endif

ilya-biryukov (Contributor, Author) commented

@owenca, @HazardyKnusperkeks could you please take another look and approve if this looks good to you?
I believe all comments should be addressed at this point.

ilya-biryukov merged commit 5723fce into llvm:main on Jan 15, 2024
ilya-biryukov deleted the format3 branch on January 15, 2024, 13:57
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
This addresses a problem with formatting attributes. Some context:
- eaff083 changed `isStartOfName` to fix problems inside
`#pragma`s, but this behavior changed formatting of attribute macros in
an undesirable way.
- efeb546 changed Google format style
to fix some widely used attributes.

Instead of changing the format style, this commit specializes behavior
introduced in eaff083 to `#pragma`s. This seems to work well in
both cases.

Also update the test with two `GUARDED_BY` directives. While the
formatting after efeb546 seems better,
this case is rare enough to not warrant the extra complexity. We are
reverting it back to the state it had before
efeb546.

---------

Co-authored-by: Owen Pan <[email protected]>