Skip to content

Add PR check to suggest alternatives to using undef #118506

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Dec 11, 2024
Merged

Add PR check to suggest alternatives to using undef #118506

merged 10 commits into from
Dec 11, 2024

Conversation

nunoplopes
Copy link
Member

As discussed in discourse, here's a patch that adds a comment to PRs that introduce new uses of undef.

It uses the the `git diff -S' command to find new uses, to avoid warning about old uses. It further trims down with a regex to get only added (+) lines.

See here for the script in action: https://github.com/nunoplopes/llvm2/pull/12#issuecomment-2514942240

Free feel to suggest better wording for the text, ofc!

Copy link

github-actions bot commented Dec 3, 2024

⚠️ Python code formatter, darker found issues in your code. ⚠️

You can test this locally with the following command:
darker --check --diff -r 7802fb5f514be327576b69569556ec9096e5fdd7...d7befbb080dc69c3fff307161adccbe962d2adda llvm/utils/git/code-format-helper.py
View the diff from darker here.
--- code-format-helper.py	2024-12-11 12:48:11.000000 +0000
+++ code-format-helper.py	2024-12-11 12:51:31.660221 +0000
@@ -324,11 +324,21 @@
 
     def filter_changed_files(self, changed_files: List[str]) -> List[str]:
         filtered_files = []
         for path in changed_files:
             _, ext = os.path.splitext(path)
-            if ext in (".cpp", ".c", ".h", ".hpp", ".hxx", ".cxx", ".inc", ".cppm", ".ll"):
+            if ext in (
+                ".cpp",
+                ".c",
+                ".h",
+                ".hpp",
+                ".hxx",
+                ".cxx",
+                ".inc",
+                ".cppm",
+                ".ll",
+            ):
                 filtered_files.append(path)
         return filtered_files
 
     def has_tool(self) -> bool:
         return True

@nunoplopes nunoplopes requested a review from regehr December 4, 2024 09:24
@@ -312,7 +313,84 @@ def format_run(self, changed_files: List[str], args: FormatArgs) -> Optional[str
return None


ALL_FORMATTERS = (DarkerFormatHelper(), ClangFormatHelper())
class UndefGetFormatHelper(FormatHelper):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little strange this being in code-format-helper.py but that's perhaps more the naming of the file than this being in the wrong place. If we think of this as "stuff that runs on the changed lines" then it's not so weird. Plus you get for free the feature that updates the comment once the PR author fixes it, duplicating that would be annoying.

I would at least override pr_comment_text_for_diff because the explanation from the tool is not a diff. Also if you do that, you can present it as Markdown with code blocks so it's nicer to read.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I had the same thought as @DavidSpickett at first. I always thought of code-format-helper as something that's just checking formatting issues with clang format and friends. But I guess it makes sense to have other checks without having to create a separate github flow for it.

But I think we need to change the name of the script and the integration name in that case to really make it a bit more generic. Otherwise I think it might cause confusion.

What do you think @boomanaiden154 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping it in here seems like the best place. It allows taking advantage of the existing infrastructure. I agree that we should probably change the name to reflect this later.

Perfectly fine to leave that to a follow up patch in my opinion though. There are probably some other things that can get cleaned up at the same time.

@nunoplopes
Copy link
Member Author

Thank you for the feedback. I've made the requested changes.

Here's the new comment: https://github.com/nunoplopes/llvm2/pull/12#issuecomment-2517591665

@DavidSpickett
Copy link
Collaborator

That looks much better.

But I wonder if you will need to expand the comment if for example folks (like me :) ) have little idea what undef and poison are in the first place. So if you put this somewhere in the documentation and linked to it from the comment, you can update the documentation and everyone benefits. Like:

See [this documentation](link) for more explanation and alternatives to using undef."

And for anyone who starts with the documentation, they will be aware of how (not) to use undef as well.

Maybe it could go in the langref as part of the description of undef?

@nunoplopes
Copy link
Member Author

Ok, I added a link to langref. The message itself gives an example on how to avoid the most common undef mistake.

I think the patch is a good starting point to go live. We can work through more documentation afterwards, I think.

@regehr
Copy link
Contributor

regehr commented Dec 5, 2024

I don't have any opinion about the specific mechanism used here, but I love this and I think it's a great direction for LLVM. I agree we probably want to evolve the message to contain more specifics and details.

@nunoplopes
Copy link
Member Author

Added link to the new UB manual. Ok to merge now?

@nunoplopes nunoplopes merged commit 19bc282 into llvm:main Dec 11, 2024
5 of 8 checks passed
@philnik777
Copy link
Contributor

This flags #undef: #118167 (comment).

@nunoplopes please fix this soon, since it's quite confusing for new contributors.

@nunoplopes
Copy link
Member Author

This flags #undef: #118167 (comment).

@nunoplopes please fix this soon, since it's quite confusing for new contributors.

Regex tweaked here: 0c62920
Please let me know if come across other cases that need tweaking. Thanks!

@MacDue
Copy link
Member

MacDue commented Feb 21, 2025

@nunoplopes Just using the word "undef" in a C++ comment will also trigger this, e.g.:

// Using undef in a comment will trigger the bot

@nunoplopes
Copy link
Member Author

@nunoplopes Just using the word "undef" in a C++ comment will also trigger this, e.g.:

// Using undef in a comment will trigger the bot

it is impossible to come up with a regex that will cover all cases. Nevertheless, writing undef is probably wrong. Maybe people want to write undefined behavior (UB) or poison. undef should be avoided as it's misleading.
All warnings generated by tools are subject to false positives; it's up to the reviewers to exercise their best judgement.

@CarlosAlbertoEnciso
Copy link
Member

@nunoplopes Just using the word "undef" in a C++ comment will also trigger this. I think is very confusing.

@boomanaiden154
Copy link
Contributor

nunoplopes Just using the word "undef" in a C++ comment will also trigger this. I think is very confusing.

Have you tested this? It should be specific to .ll files.

@MacDue
Copy link
Member

MacDue commented Apr 7, 2025

nunoplopes Just using the word "undef" in a C++ comment will also trigger this. I think is very confusing.

Have you tested this? It should be specific to .ll files.

I reported this earlier too. It's not just .ll files that are checked as the regex is:

regex = "([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)"

[^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-] is for checking *.ll files and UndefValue::get is for checking C++ files, but the regex makes no distinction and runs both checks on both file kinds.

So a comment like:

// Note this should never be undef or there's ...

Will trigger the bot, see: #134682

@MacDue
Copy link
Member

MacDue commented Apr 7, 2025

I think this tweak should reduce the false positives: #134687

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants