[SCCP] Handle llvm.vscale intrinsic calls #114033

Merged: 2 commits, Oct 31, 2024

@hazzlim hazzlim commented Oct 29, 2024

Teach SCCP to compute a constant range for calls to llvm.vscale intrinsics.

llvmbot commented Oct 29, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-function-specialization

Author: Hari Limaye (hazzlim)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/114033.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/SCCPSolver.cpp (+6)
  • (added) llvm/test/Transforms/SCCP/vscale-intrinsic.ll (+56)
diff --git a/llvm/lib/Transforms/Utils/SCCPSolver.cpp b/llvm/lib/Transforms/Utils/SCCPSolver.cpp
index c65710ea7551ac..4225e7e80fda6f 100644
--- a/llvm/lib/Transforms/Utils/SCCPSolver.cpp
+++ b/llvm/lib/Transforms/Utils/SCCPSolver.cpp
@@ -1923,6 +1923,12 @@ void SCCPInstVisitor::handleCallResult(CallBase &CB) {
       return (void)mergeInValue(IV, &CB, CopyOfVal);
     }
 
+    if (II->getIntrinsicID() == Intrinsic::vscale) {
+      unsigned BitWidth = CB.getType()->getScalarSizeInBits();
+      const ConstantRange Result = getVScaleRange(II->getFunction(), BitWidth);
+      return (void)mergeInValue(II, ValueLatticeElement::getRange(Result));
+    }
+
     if (ConstantRange::isIntrinsicSupported(II->getIntrinsicID())) {
       // Compute result range for intrinsics supported by ConstantRange.
       // Do this even if we don't know a range for all operands, as we may
diff --git a/llvm/test/Transforms/SCCP/vscale-intrinsic.ll b/llvm/test/Transforms/SCCP/vscale-intrinsic.ll
new file mode 100644
index 00000000000000..ca08c305c3059d
--- /dev/null
+++ b/llvm/test/Transforms/SCCP/vscale-intrinsic.ll
@@ -0,0 +1,56 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=sccp -S | FileCheck %s
+
+define i1 @vscale_i32_noattr() {
+; CHECK-LABEL: define i1 @vscale_i32_noattr() {
+; CHECK-NEXT:    [[SCALE:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp ule i32 [[SCALE]], 16
+; CHECK-NEXT:    [[RES:%.*]] = and i1 true, [[CMP2]]
+; CHECK-NEXT:    ret i1 [[RES]]
+;
+  %scale = call i32 @llvm.vscale.i32()
+  %cmp1 = icmp uge i32 %scale, 1
+  %cmp2 = icmp ule i32 %scale, 16
+  %res = and i1 %cmp1, %cmp2
+  ret i1 %res
+}
+
+define i1 @vscale_i32_attr() vscale_range(1, 16) {
+; CHECK-LABEL: define i1 @vscale_i32_attr(
+; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:    [[SCALE:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i1 true
+;
+  %scale = call i32 @llvm.vscale.i32()
+  %cmp1 = icmp uge i32 %scale, 1
+  %cmp2 = icmp ule i32 %scale, 16
+  %res = and i1 %cmp1, %cmp2
+  ret i1 %res
+}
+
+define i1 @vscale_i64_noattr() {
+; CHECK-LABEL: define i1 @vscale_i64_noattr() {
+; CHECK-NEXT:    [[SCALE:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp ule i64 [[SCALE]], 16
+; CHECK-NEXT:    [[RES:%.*]] = and i1 true, [[CMP2]]
+; CHECK-NEXT:    ret i1 [[RES]]
+;
+  %scale = call i64 @llvm.vscale.i64()
+  %cmp1 = icmp uge i64 %scale, 1
+  %cmp2 = icmp ule i64 %scale, 16
+  %res = and i1 %cmp1, %cmp2
+  ret i1 %res
+}
+
+define i1 @vscale_i64_attr() vscale_range(1, 16) {
+; CHECK-LABEL: define i1 @vscale_i64_attr(
+; CHECK-SAME: ) #[[ATTR0]] {
+; CHECK-NEXT:    [[SCALE:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:    ret i1 true
+;
+  %scale = call i64 @llvm.vscale.i64()
+  %cmp1 = icmp uge i64 %scale, 1
+  %cmp2 = icmp ule i64 %scale, 16
+  %res = and i1 %cmp1, %cmp2
+  ret i1 %res
+}
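
The expected output in the tests above can be reproduced with a toy model of range-based folding. This is only a sketch under stated assumptions: `ToyRange`, `toyVScaleRange`, `foldUGE`, and `foldULE` are hypothetical names invented for illustration, not LLVM APIs, and the model uses plain inclusive ranges rather than LLVM's `ConstantRange`. It assumes, consistent with the CHECK lines, that without a `vscale_range` attribute only `vscale >= 1` is known.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Toy inclusive unsigned range [Lo, Hi]; NOT LLVM's ConstantRange.
struct ToyRange {
  uint64_t Lo;
  uint64_t Hi;
};

// Hypothetical stand-in for the vscale range lookup. With no attribute,
// only "vscale >= 1" is assumed; with vscale_range(Min, Max), both bounds
// come from the attribute.
ToyRange toyVScaleRange(std::optional<ToyRange> Attr, unsigned BitWidth) {
  assert(BitWidth >= 1 && BitWidth <= 64 && "toy model supports 1..64 bits");
  uint64_t MaxVal =
      BitWidth == 64 ? UINT64_MAX : ((uint64_t{1} << BitWidth) - 1);
  if (!Attr)
    return {1, MaxVal};
  return *Attr;
}

// Returns true/false when `V uge C` has the same answer for every V in R,
// and std::nullopt when the range does not decide the comparison.
std::optional<bool> foldUGE(ToyRange R, uint64_t C) {
  if (R.Lo >= C)
    return true;
  if (R.Hi < C)
    return false;
  return std::nullopt;
}

// Same idea for `V ule C`.
std::optional<bool> foldULE(ToyRange R, uint64_t C) {
  if (R.Hi <= C)
    return true;
  if (R.Lo > C)
    return false;
  return std::nullopt;
}
```

In this model, the no-attribute case decides `uge 1` but leaves `ule 16` undecided, matching the `and i1 true, [[CMP2]]` lines, while the `vscale_range(1, 16)` case decides both comparisons, matching `ret i1 true`.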

@dtcxzyw dtcxzyw requested review from nikic and fhahn October 29, 2024 14:28
@dtcxzyw dtcxzyw left a comment

I'm not opposed to handling more intrinsics in SCCP. But do we really need it?
Can you provide some tests that cannot be simplified by InstSimplify/InstCombine?

hazzlim commented Oct 29, 2024

The motivating case for this is actually to make use of the constant range information in Function Specialization. We see cases like the following:

define i32 @void(i32 %x) vscale_range(1, 16) {
...
  %scale = call i32 @llvm.vscale.i32()
  %bound = shl nsw nuw i32 %scale, 3
  %cmp = icmp ult i32 %x, %bound
  br i1 %cmp, label %after, label %work

work:
...do a lot of work...
  br label %after

after:
...

We currently can't detect that a specialization candidate, e.g. { x = 1 }, makes the %work block dead, because in IPSCCP the SCCPSolver hasn't computed a constant range for %bound. This prevents us from properly estimating the code size savings of a specialization.
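
The range arithmetic behind this example can be spelled out with a small standalone sketch. It is a toy model under stated assumptions, not the SCCPSolver implementation: `ToyRange`, `shlNoWrap`, `alwaysULT`, and `workBlockIsDead` are hypothetical names, and the `vscale` range is hard-coded to the `vscale_range(1, 16)` attribute from the IR above.

```cpp
#include <cassert>
#include <cstdint>

// Toy inclusive unsigned range [Lo, Hi]; NOT LLVM's ConstantRange.
struct ToyRange {
  uint32_t Lo;
  uint32_t Hi;
};

// Models `shl nuw`: with no unsigned wrap, shifting both bounds gives the
// bounds of the shifted value.
ToyRange shlNoWrap(ToyRange R, unsigned Shift) {
  assert(Shift < 32 && (uint64_t{R.Hi} << Shift) <= UINT32_MAX &&
         "shift would wrap, which nuw rules out");
  return {R.Lo << Shift, R.Hi << Shift};
}

// True when `X ult V` holds for every V in Bound, so the branch on %cmp
// always goes to %after.
bool alwaysULT(uint32_t X, ToyRange Bound) { return X < Bound.Lo; }

// Hypothetical helper: is %work unreachable when %x is specialized to X?
bool workBlockIsDead(uint32_t X) {
  ToyRange VScale{1, 16};                // from vscale_range(1, 16)
  ToyRange Bound = shlNoWrap(VScale, 3); // %bound lies in [8, 128]
  return alwaysULT(X, Bound);
}
```

For the candidate { x = 1 }, `%bound` is at least 8, so `%cmp` is always true and `%work` is never entered. For `x = 8` the comparison depends on the runtime value of `vscale`, so nothing can be concluded.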

dtcxzyw commented Oct 29, 2024

Makes sense to me. Can you add this case?

hazzlim commented Oct 29, 2024

I've added a similar test case showing branch elimination in SCCP. The full FunctionSpecialization case requires an additional change (in #114073).

@dtcxzyw dtcxzyw left a comment

LG

@hazzlim hazzlim merged commit b396921 into llvm:main Oct 31, 2024
8 checks passed