Skip to content

[LLVM][Parser] Check invalid overload suffix for intrinsics #108315

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jurahul
Copy link
Contributor

@jurahul jurahul commented Sep 12, 2024

No description provided.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this break the remangling upgrade?

@jurahul
Copy link
Contributor Author

jurahul commented Sep 12, 2024

Maybe. I didn't know what it is until now. I understand that in the bitcode reader context but did not know that it was supported in the asm parser as well. So currently it looks like we allow any suffix when parsing (and essentially ignore it), so old versions of intrinsics with say different types or different mangling scheme still get parsed and then get auto-upgraded (since the correct mangling can be always constructed from the existing types in the IR). And there is no check to verify that the ignored suffix was a valid suffix in one of the older versions of the intrinsic.

It seems to me this check could still be useful in some cases. For example, I see ~113 test failures in LLVM due to this and it looks like a lot of them are just typos that get glossed over by the parser. Maybe we can add a mode to enable strict-intrinsic-overload-mangle to llvm-as (and by definition to LLParser and LLVM's assembly parsing API)? The API will disable it by default, so existing users of the API (upstream or downstream) are unaffected. llvm-as will enable it by default, so LLVM lit tests can be strict. Or even the LIT testing infra can enable the strict mode for llvm-as and other tools, with a way to disable it for specific tests. That will ensure that we don't unintentionally let bad IR sneak in and stay un-flagged.

WDYT? If this seems reasonable, I can also start a discourse thread if needed.

@jurahul
Copy link
Contributor Author

jurahul commented Sep 12, 2024

Additionally, if mangling can always be reconstructed, maybe LLVM should always elide it? I know that's a bigger change but may be worth considering. That means LLVM assembly syntax won't require mangling (it does not today as well due to it being ignored) but in strict mode LLVM assembly for overloaded intrinsics will have no mangling (We assert that Suffix = "" in the code above) and ASM printer will also hide the mangling. In memoy IR (i.e., Function objects) will still have mangled names (that's required I think as 2 different Global Values cannot have same name).

@jurahul
Copy link
Contributor Author

jurahul commented Sep 12, 2024

In fact, changing ASM writer to drop the mangling in a call inst is easy enough and that code will continue to get parsed. We would just need the parser to allow declaring overloaded intrinsics with same name and different arg/ret types.

@nikic
Copy link
Contributor

nikic commented Sep 12, 2024

We no longer require mangling suffixes for intrinsic calls since #89172.

I don't think we should omit the mangling suffix when writing IR though, since that will obscure what actually happens without clear benefit. (For example, it would become very hard to actually figure out whether the correct mangling suffix was chosen or not.)

Can you please explain what the actual problem you are trying to solve is?

@jurahul
Copy link
Contributor Author

jurahul commented Sep 12, 2024

The problem I ran into was literally this: I was writing a unit test using overloaded intrinsics and observed that no one complained if I messed up the mangling suffix (use .f16 for a float variant for instance), and that motivated this change.

I agree that having mangling in the IR is good. May be the --strict-intrinsic-overloads mode in various tools (llvm-as and opt might be good enough) disabled by default, auto enabled by LIT testing framework is a good middle ground to not break existing code and to not allow bad mangling suffix when not intended?

I am actually not sure what happens if in C++ someone creates a Function::Create() to create an intrinsic decl but with bad mangling suffix. I'd expect that to be strict, not sure if it is (maybe IR verifier checks this?)

It does:

 const std::string ExpectedName =
      Intrinsic::getName(ID, ArgTys, IF->getParent(), IFTy);
  Check(ExpectedName == IF->getName(),
        "Intrinsic name not mangled correctly for type arguments! "
        "Should be: " +
            ExpectedName,
        IF);

@jurahul
Copy link
Contributor Author

jurahul commented Sep 12, 2024

BTW, thanks for proactively looking at the PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants