-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[InstSimplify] Provide information about the range of possible values that ucmp
/scmp
can return
#96410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…false based on the range of possible values of UCMP/SCMP
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-analysis Author: None (Poseydon42) ChangesThis makes it possible to fold dumb comparisons like Full diff: https://github.com/llvm/llvm-project/pull/96410.diff 2 Files Affected:
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 0df061923f625..6631973f96209 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -9393,6 +9393,9 @@ static ConstantRange getRangeForIntrinsic(const IntrinsicInst &II) {
if (!II.getParent() || !II.getFunction())
break;
return getVScaleRange(II.getFunction(), Width);
+ case Intrinsic::scmp:
+ case Intrinsic::ucmp:
+ return ConstantRange::getNonEmpty(APInt::getAllOnes(Width), APInt::getOneBitSet(Width, 0));
default:
break;
}
diff --git a/llvm/test/Transforms/InstSimplify/uscmp.ll b/llvm/test/Transforms/InstSimplify/uscmp.ll
index db96aa39192d6..47720060acb52 100644
--- a/llvm/test/Transforms/InstSimplify/uscmp.ll
+++ b/llvm/test/Transforms/InstSimplify/uscmp.ll
@@ -229,6 +229,26 @@ define <4 x i8> @ucmp_with_addition_vec(<4 x i32> %x) {
ret <4 x i8> %2
}
+define i1 @scmp_eq_4(i32 %x, i32 %y) {
+; CHECK-LABEL: define i1 @scmp_eq_4(
+; CHECK-SAME: i32 [[X:%.*]], i32 [[Y:%.*]]) {
+; CHECK-NEXT: ret i1 false
+;
+ %1 = call i8 @llvm.scmp(i32 %x, i32 %y)
+ %2 = icmp eq i8 %1, 4
+ ret i1 %2
+}
+
+define i1 @ucmp_ne_negative_2(i32 %x, i32 %y) {
+; CHECK-LABEL: define i1 @ucmp_ne_negative_2(
+; CHECK-SAME: i32 [[X:%.*]], i32 [[Y:%.*]]) {
+; CHECK-NEXT: ret i1 true
+;
+ %1 = call i8 @llvm.ucmp(i32 %x, i32 %y)
+ %2 = icmp ne i8 %1, -2
+ ret i1 %2
+}
+
; Negative case: mismatched signedness of predicates
define i8 @scmp_known_ugt(i32 %x, i32 %y) {
; CHECK-LABEL: define i8 @scmp_known_ugt(
|
…P/SCMP can produce to fold some comparisons
cc973cf
to
efe07d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In terms of high level approach, an alternative would be to instead add the range(-1, 2)
return attribute to the intrinsic. This has the advantage that all places that understand range attributes will automatically gain support for this.
The downside is that doing this is not entirely straightforward. One could easily add the range to callsites in InstCombine, but it would be better to generate it directly on the declaration. That would be done by adding a new intrinsic property to https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/IR/Intrinsics.td and then generating a call to the appropriate constructor in
void IntrinsicEmitter::EmitAttributes(const CodeGenIntrinsicTable &Ints, |
Intrinsic::getAttributes()
would need an extra FunctionType *
argument, so that the attributes for the correct type could be emitted.
I don't know how many places do use the range attribute to aid their optimizations, but if that number is not too high then I feel like the current approach should be preferred to inventing something new for a single use case. Additionally, I think that in many places where range information is used, the actual range can be shrunk depending on the ranges of the arguments (e.g. the default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The point of encoding this in the intrinsic attributes is that it's not for a specific use case, but a general mechanism. Target-independent intrinsics with fixed result ranges are pretty rare, but this comes up for target-specific intrinsics as well. For example arm_mve_pred_v2i currently adds a fixed range attribute via an InstCombine hook: https://github.com/llvm/llvm-project/blob/c19028f364cceb4b2111c7956dcacbc257a96fd4/llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp#L195C19-L195C35 x86_sse42_crc32_64_64 instead adds a special case in ValueTracking: https://github.com/llvm/llvm-project/blob/c19028f364cceb4b2111c7956dcacbc257a96fd4/llvm/lib/Analysis/ValueTracking.cpp#L1732C23-L1732C44 We have some NVPTX intrinsics with fixed ranges as well. Anyway, we should still land this patch in the meantime. I may look into the generic mechanism myself. |
Do you have a plan to fold some existing patterns into |
I do, but probably after the main bulk of optimizations for them is done. |
… that `ucmp`/`scmp` can return (llvm#96410) This makes it possible to fold dumb comparisons like `ucmp(x, y) == 7`.
This makes it possible to fold dumb comparisons like
ucmp(x, y) == 7
.