Skip to content

[docs] Add guide for Undefined Behavior #119220

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Dec 11, 2024
Merged

[docs] Add guide for Undefined Behavior #119220

merged 12 commits into from
Dec 11, 2024

Conversation

nunoplopes
Copy link
Member

Add a new guide about undefined behavior.

Preview here: https://web.ist.utl.pt/nuno.lopes/UndefinedBehavior.html

This is to be linked from the automatic code review tool, that is now warning about undef (#118506). In the future it could use Alive2 to warn about tests that are full UB.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea, and this looks pretty reasonable at a glance.

I think one topic that may be worth mentioning here is the "logical and/or" pattern. It's a non-obvious construction used to get the correct poison propagation semantics.

Not sure if it's out of scope, but we could also mention related APIs, like isGuaranteedNotToBePoison.

@nunoplopes
Copy link
Member Author

Thanks for the quick review! I've patched the things you mentioned.

I think having a section about APIs makes sense! For the and/or logic, do we have any API you want to mention? Or do you have an example off the top of your head that we can show?

@regehr
Copy link
Contributor

regehr commented Dec 9, 2024

Nuno this is great and I don't know why we didn't do it like >5 years ago.

Maybe link to the "taming undefined behavior" paper, for people who want more gory details?

Maybe we should include a bunch of links into our CE instance so that people can play with this stuff and see how refinement checks fall out in practice? Of course this shouldn't block this from getting added, we can add stuff later, and I'm happy to help.

Also I think we can write quite a bit more in the "writing tests that avoid UB" part.

But anyway I think this adds value as-is and there's no reason not to merge it, then keep working.

@nunoplopes
Copy link
Member Author

Yes, I agree that we can add a few more things. I'm sure we'll get questions about it, so we can refine as we go.
I just wanted to get something started 🙂 And something to like from the automatic code reviewer. We need more of these automatic things to improve the quality of the code IMHO.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nikic
Copy link
Contributor

nikic commented Dec 11, 2024

I think having a section about APIs makes sense! For the and/or logic, do we have any API you want to mention? Or do you have an example off the top of your head that we can show?

For example, we can't transform:

define i1 @src(i32 %x, i32 %y) {
    %cmp1 = icmp ne i32 %x, 0
    %cmp2 = icmp ugt i32 %x, %y
    %and = select i1 %cmp1, i1 %cmp2, i1 false
    ret i1 %and
}

define i1 @tgt(i32 %x, i32 %y) {
    %cmp2 = icmp ugt i32 %x, %y
    ret i1 %cmp2
}

But we can do it for both of these:

define i1 @src2(i32 %x, i32 %y) {
    %cmp1 = icmp ne i32 %x, 0
    %cmp2 = icmp ugt i32 %x, %y
    %and = select i1 %cmp2, i1 %cmp1, i1 false
    ret i1 %and
}

define i1 @src3(i32 %x, i32 %y) {
    %cmp1 = icmp ne i32 %x, 0
    %cmp2 = icmp ugt i32 %x, %y
    %and = and i1 %cmp1, %cmp2
    ret i1 %and
}

The logical and/or form has asymmetric poison propagation limiting which transforms are valid.

@antoniofrighetto
Copy link
Contributor

@nunoplopes Kindly drop Co-authored-by before merging, as I have not coauthored the change.

@nunoplopes
Copy link
Member Author

Thanks @nikic, I'll add that example next.

@nunoplopes nunoplopes merged commit 0100c63 into llvm:main Dec 11, 2024
5 of 7 checks passed
@nunoplopes nunoplopes deleted the ub_docs branch December 11, 2024 12:23
Copy link
Collaborator

@shafik shafik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this write-up. I don't usually sit at this level but I do often work with bug reports dealing w/ UB where this matters.

Maybe it does not fit perfectly here, not sure, but I feel like some greater discussion about some of the more unintuitive effects of UB might be helpful.

One of them that has come up often is when function bodies end up merged due to UB e.g. https://godbolt.org/z/8sEE54E7d

@nikic
Copy link
Contributor

nikic commented Dec 12, 2024

@shafik I think this documentation is intended for LLVM developers, not for C/C++ users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants