Skip to content

[VP] Check if VP ops with functional intrinsics are speculatable #69504

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 26, 2023

Conversation

lukel97
Copy link
Contributor

@lukel97 lukel97 commented Oct 18, 2023

Noticed whilst working on #69494. VP intrinsics whose functional equivalent is
an intrinsic were being marked as their lanes being non-speculatable, even if
the underlying intrinsic was speculatable.

This meant that

  %1 = call <4 x i32> @llvm.vp.umax(<4 x i32> %x, <4 x i32> %y, <4 x i1> %mask, i32 %evl)

would be expanded out to

  %.splatinsert = insertelement <4 x i32> poison, i32 %evl, i64 0
  %.splat = shufflevector <4 x i32> %.splatinsert, <4 x i32> poison, <4 x i32> zeroinitializer
  %1 = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %.splat
  %2 = and <4 x i1> %1, %mask
  %3 = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %x, <4 x i32> %y)

instead of

  %1 = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %x, <4 x i32> %y)

The cause of this was isSafeToSpeculativelyExecuteWithOpcode checking the
function attributes for the VP instruction itself, not the functional
intrinsic. Since isSafeToSpeculativelyExecuteWithOpcode expects an already
materialized instruction, we can't use it directly for the intrinsic case. So
this fixes it by manually checking the function attributes on the intrinsic.

Noticed whilst working on llvm#69494. VP intrinsics whose functional equivalent is
an intrinsic were being marked as their lanes being non-speculatable, even if
the underlying intrinsic was speculatable.

This meant that

  %1 = call <4 x i32> @llvm.vp.umax(<4 x i32> %x, <4 x i32> %y, <4 x i1> %mask, i32 %evl)

would be expanded out to

  %.splatinsert = insertelement <4 x i32> poison, i32 %evl, i64 0
  %.splat = shufflevector <4 x i32> %.splatinsert, <4 x i32> poison, <4 x i32> zeroinitializer
  %1 = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %.splat
  %2 = and <4 x i1> %1, %mask
  %3 = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %x, <4 x i32> %y)

instead of

  %1 = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %x, <4 x i32> %y)

The cause of this was isSafeToSpeculativelyExecuteWithOpcode checking the
function attributes for the VP instruction itself, not the functional
intrinsic.  Since isSafeToSpeculativelyExecuteWithOpcode expects an already
materialized instruction, we can't use it directly for the intrinsic case. So
this fixes it by manually checking the function attributes on the intrinsic.
@RKSimon
Copy link
Collaborator

RKSimon commented Oct 19, 2023

Please can you precommit the test changes so the patch properly shows the codegen diff

@lukel97
Copy link
Contributor Author

lukel97 commented Oct 19, 2023

Please can you precommit the test changes so the patch properly shows the codegen diff

I wanted to do that originally but expand-vp.ll is a handwritten test unfortunately. I'm tempted to convert it to update_test_checks.py, should I do that in a separate PR?

@lukel97
Copy link
Contributor Author

lukel97 commented Oct 26, 2023

Ping. Looks like the test was intentionally handwritten, with the agreement to convert it to UTC once caching had been implemented: https://reviews.llvm.org/D78203#inline-949565

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lukel97 lukel97 merged commit 2e85123 into llvm:main Oct 26, 2023
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Oct 26, 2023
…m#69504)

Noticed whilst working on llvm#69494. VP intrinsics whose functional
equivalent is
an intrinsic were being marked as their lanes being non-speculatable,
even if
the underlying intrinsic was speculatable.

This meant that

```llvm
  %1 = call <4 x i32> @llvm.vp.umax(<4 x i32> %x, <4 x i32> %y, <4 x i1> %mask, i32 %evl)
```

would be expanded out to

```llvm
  %.splatinsert = insertelement <4 x i32> poison, i32 %evl, i64 0
  %.splat = shufflevector <4 x i32> %.splatinsert, <4 x i32> poison, <4 x i32> zeroinitializer
  %1 = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %.splat
  %2 = and <4 x i1> %1, %mask
  %3 = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %x, <4 x i32> %y)
```

instead of

```llvm
  %1 = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %x, <4 x i32> %y)
```

The cause of this was isSafeToSpeculativelyExecuteWithOpcode checking
the
function attributes for the VP instruction itself, not the functional
intrinsic. Since isSafeToSpeculativelyExecuteWithOpcode expects an
already
materialized instruction, we can't use it directly for the intrinsic
case. So
this fixes it by manually checking the function attributes on the
intrinsic.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants