Skip to content

[InstSimplify] Provide information about the range of possible values that ucmp/scmp can return #96410

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 24, 2024

Conversation

Poseydon42
Copy link
Contributor

This makes it possible to fold dumb comparisons like ucmp(x, y) == 7.

…false based on the range of possible values of UCMP/SCMP
@Poseydon42 Poseydon42 requested a review from nikic as a code owner June 22, 2024 23:08
@llvmbot llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Jun 22, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 22, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: None (Poseydon42)

Changes

This makes it possible to fold dumb comparisons like ucmp(x, y) == 7.


Full diff: https://github.com/llvm/llvm-project/pull/96410.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+3)
  • (modified) llvm/test/Transforms/InstSimplify/uscmp.ll (+20)
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 0df061923f625..6631973f96209 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -9393,6 +9393,9 @@ static ConstantRange getRangeForIntrinsic(const IntrinsicInst &II) {
     if (!II.getParent() || !II.getFunction())
       break;
     return getVScaleRange(II.getFunction(), Width);
+  case Intrinsic::scmp:
+  case Intrinsic::ucmp:
+    return ConstantRange::getNonEmpty(APInt::getAllOnes(Width), APInt::getOneBitSet(Width, 0));
   default:
     break;
   }
diff --git a/llvm/test/Transforms/InstSimplify/uscmp.ll b/llvm/test/Transforms/InstSimplify/uscmp.ll
index db96aa39192d6..47720060acb52 100644
--- a/llvm/test/Transforms/InstSimplify/uscmp.ll
+++ b/llvm/test/Transforms/InstSimplify/uscmp.ll
@@ -229,6 +229,26 @@ define <4 x i8> @ucmp_with_addition_vec(<4 x i32> %x) {
   ret <4 x i8> %2
 }
 
+define i1 @scmp_eq_4(i32 %x, i32 %y) {
+; CHECK-LABEL: define i1 @scmp_eq_4(
+; CHECK-SAME: i32 [[X:%.*]], i32 [[Y:%.*]]) {
+; CHECK-NEXT:    ret i1 false
+;
+  %1 = call i8 @llvm.scmp(i32 %x, i32 %y)
+  %2 = icmp eq i8 %1, 4
+  ret i1 %2
+}
+
+define i1 @ucmp_ne_negative_2(i32 %x, i32 %y) {
+; CHECK-LABEL: define i1 @ucmp_ne_negative_2(
+; CHECK-SAME: i32 [[X:%.*]], i32 [[Y:%.*]]) {
+; CHECK-NEXT:    ret i1 true
+;
+  %1 = call i8 @llvm.ucmp(i32 %x, i32 %y)
+  %2 = icmp ne i8 %1, -2
+  ret i1 %2
+}
+
 ; Negative case: mismatched signedness of predicates
 define i8 @scmp_known_ugt(i32 %x, i32 %y) {
 ; CHECK-LABEL: define i8 @scmp_known_ugt(

@Poseydon42 Poseydon42 force-pushed the uscmp-instsimplify-range branch from cc973cf to efe07d8 Compare June 22, 2024 23:09
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of high level approach, an alternative would be to instead add the range(-1, 2) return attribute to the intrinsic. This has the advantage that all places that understand range attributes will automatically gain support for this.

The downside is that doing this is not entirely straightforward. One could easily add the range to callsites in InstCombine, but it would be better to generate it directly on the declaration. That would be done by adding a new intrinsic property to https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/IR/Intrinsics.td and then generating a call to the appropriate constructor in

void IntrinsicEmitter::EmitAttributes(const CodeGenIntrinsicTable &Ints,
. The tricky thing is that in this case, while the range itself is fixed, the bit width of the range depends on the type overload, which is not the case for any of the existing intrinsic properties. So I think Intrinsic::getAttributes() would need an extra FunctionType * argument, so that the attributes for the correct type could be emitted.

@Poseydon42
Copy link
Contributor Author

I don't know how many places do use the range attribute to aid their optimizations, but if that number is not too high then I feel like the current approach should be preferred to inventing something new for a single use case. Additionally, I think that in many places where range information is used, the actual range can be shrunk depending on the ranges of the arguments (e.g. the default [-1,2) range can be shrunk down to [-1,1) when we know that LHS <= RHS. Therefore, I think it may be more optimal to implement this sort of check in the places that do use the range info separately, instead of adding this information on the level of intrinsic itself.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nikic
Copy link
Contributor

nikic commented Jun 23, 2024

The point of encoding this in the intrinsic attributes is that it's not for a specific use case, but a general mechanism. Target-independent intrinsics with fixed result ranges are pretty rare, but this comes up for target-specific intrinsics as well. For example arm_mve_pred_v2i currently adds a fixed range attribute via an InstCombine hook: https://github.com/llvm/llvm-project/blob/c19028f364cceb4b2111c7956dcacbc257a96fd4/llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp#L195C19-L195C35 x86_sse42_crc32_64_64 instead adds a special case in ValueTracking: https://github.com/llvm/llvm-project/blob/c19028f364cceb4b2111c7956dcacbc257a96fd4/llvm/lib/Analysis/ValueTracking.cpp#L1732C23-L1732C44 We have some NVPTX intrinsics with fixed ranges as well.

Anyway, we should still land this patch in the meantime. I may look into the generic mechanism myself.

@dtcxzyw
Copy link
Member

dtcxzyw commented Jun 23, 2024

Do you have a plan to fold some existing patterns into ucmp/scmp?

@Poseydon42
Copy link
Contributor Author

I do, but probably after the main bulk of optimizations for them is done.

@dtcxzyw dtcxzyw merged commit ffec315 into llvm:main Jun 24, 2024
7 checks passed
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
… that `ucmp`/`scmp` can return (llvm#96410)

This makes it possible to fold dumb comparisons like `ucmp(x, y) == 7`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants