-
Notifications
You must be signed in to change notification settings - Fork 14.3k
Add PR check to suggest alternatives to using undef #118506
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
You can test this locally with the following command:darker --check --diff -r 7802fb5f514be327576b69569556ec9096e5fdd7...d7befbb080dc69c3fff307161adccbe962d2adda llvm/utils/git/code-format-helper.py View the diff from darker here.--- code-format-helper.py 2024-12-11 12:48:11.000000 +0000
+++ code-format-helper.py 2024-12-11 12:51:31.660221 +0000
@@ -324,11 +324,21 @@
def filter_changed_files(self, changed_files: List[str]) -> List[str]:
filtered_files = []
for path in changed_files:
_, ext = os.path.splitext(path)
- if ext in (".cpp", ".c", ".h", ".hpp", ".hxx", ".cxx", ".inc", ".cppm", ".ll"):
+ if ext in (
+ ".cpp",
+ ".c",
+ ".h",
+ ".hpp",
+ ".hxx",
+ ".cxx",
+ ".inc",
+ ".cppm",
+ ".ll",
+ ):
filtered_files.append(path)
return filtered_files
def has_tool(self) -> bool:
return True
|
@@ -312,7 +313,84 @@ def format_run(self, changed_files: List[str], args: FormatArgs) -> Optional[str | |||
return None | |||
|
|||
|
|||
ALL_FORMATTERS = (DarkerFormatHelper(), ClangFormatHelper()) | |||
class UndefGetFormatHelper(FormatHelper): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little strange this being in code-format-helper.py but that's perhaps more the naming of the file than this being in the wrong place. If we think of this as "stuff that runs on the changed lines" then it's not so weird. Plus you get for free the feature that updates the comment once the PR author fixes it, duplicating that would be annoying.
I would at least override pr_comment_text_for_diff because the explanation from the tool is not a diff. Also if you do that, you can present it as Markdown with code blocks so it's nicer to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I had the same thought as @DavidSpickett at first. I always thought of code-format-helper as something that's just checking formatting issues with clang format and friends. But I guess it makes sense to have other checks without having to create a separate github flow for it.
But I think we need to change the name of the script and the integration name in that case to really make it a bit more generic. Otherwise I think it might cause confusion.
What do you think @boomanaiden154 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keeping it in here seems like the best place. It allows taking advantage of the existing infrastructure. I agree that we should probably change the name to reflect this later.
Perfectly fine to leave that to a follow up patch in my opinion though. There are probably some other things that can get cleaned up at the same time.
Thank you for the feedback. I've made the requested changes. Here's the new comment: https://github.com/nunoplopes/llvm2/pull/12#issuecomment-2517591665 |
That looks much better. But I wonder if you will need to expand the comment if for example folks (like me :) ) have little idea what undef and poison are in the first place. So if you put this somewhere in the documentation and linked to it from the comment, you can update the documentation and everyone benefits. Like:
And for anyone who starts with the documentation, they will be aware of how (not) to use undef as well. Maybe it could go in the langref as part of the description of undef? |
Ok, I added a link to langref. The message itself gives an example on how to avoid the most common undef mistake. I think the patch is a good starting point to go live. We can work through more documentation afterwards, I think. |
I don't have any opinion about the specific mechanism used here, but I love this and I think it's a great direction for LLVM. I agree we probably want to evolve the message to contain more specifics and details. |
Added link to the new UB manual. Ok to merge now? |
This flags @nunoplopes please fix this soon, since it's quite confusing for new contributors. |
Regex tweaked here: 0c62920 |
@nunoplopes Just using the word "undef" in a C++ comment will also trigger this, e.g.:
|
it is impossible to come up with a regex that will cover all cases. Nevertheless, writing undef is probably wrong. Maybe people want to write undefined behavior (UB) or poison. undef should be avoided as it's misleading. |
@nunoplopes Just using the word "undef" in a C++ comment will also trigger this. I think is very confusing. |
Have you tested this? It should be specific to |
I reported this earlier too. It's not just
So a comment like: // Note this should never be undef or there's ... Will trigger the bot, see: #134682 |
I think this tweak should reduce the false positives: #134687 |
As discussed in discourse, here's a patch that adds a comment to PRs that introduce new uses of undef.
It uses the the `git diff -S' command to find new uses, to avoid warning about old uses. It further trims down with a regex to get only added (+) lines.
See here for the script in action: https://github.com/nunoplopes/llvm2/pull/12#issuecomment-2514942240
Free feel to suggest better wording for the text, ofc!