AMDGPU: Add noundef to mbcnt intrinsic returns #136304
I started to add the range, but it probably can't be done on the declaration due to the second added operand.
This stack of pull requests is managed by Graphite.
@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

I started to add the range, but it probably can't be done on the declaration due to the second added operand.

Full diff: https://github.com/llvm/llvm-project/pull/136304.diff

2 Files Affected:
diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
index a57eb4a6dba49..9803693253853 100644
--- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
@@ -2215,15 +2215,15 @@ def int_amdgcn_live_mask : DefaultAttrsIntrinsic <[llvm_i1_ty],
[], [NoUndef<RetIndex>, IntrReadMem, IntrInaccessibleMemOnly]
>;
-def int_amdgcn_mbcnt_lo :
- ClangBuiltin<"__builtin_amdgcn_mbcnt_lo">,
- DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty],
- [IntrNoMem]>;
-
-def int_amdgcn_mbcnt_hi :
- ClangBuiltin<"__builtin_amdgcn_mbcnt_hi">,
- DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty],
- [IntrNoMem]>;
+def int_amdgcn_mbcnt_lo
+ : ClangBuiltin<"__builtin_amdgcn_mbcnt_lo">,
+ DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty],
+ [NoUndef<RetIndex>, IntrNoMem]>;
+
+def int_amdgcn_mbcnt_hi
+ : ClangBuiltin<"__builtin_amdgcn_mbcnt_hi">,
+ DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty],
+ [NoUndef<RetIndex>, IntrNoMem]>;
// llvm.amdgcn.ds.swizzle src offset
def int_amdgcn_ds_swizzle :
diff --git a/llvm/test/Assembler/amdgcn-intrinsic-attributes.ll b/llvm/test/Assembler/amdgcn-intrinsic-attributes.ll
index 744c94ac85410..b965bef634f9d 100644
--- a/llvm/test/Assembler/amdgcn-intrinsic-attributes.ll
+++ b/llvm/test/Assembler/amdgcn-intrinsic-attributes.ll
@@ -18,12 +18,25 @@ define i32 @ds_consume(ptr addrspace(3) %ptr) {
ret i32 %ret
}
+; CHECK: declare noundef i32 @llvm.amdgcn.mbcnt.hi(i32, i32) #1
+define i32 @mbcnt_hi(i32 %a, i32 %b) {
+ %ret = call i32 @llvm.amdgcn.mbcnt.hi(i32 %a, i32 %b)
+ ret i32 %ret
+}
+
+; CHECK: declare noundef i32 @llvm.amdgcn.mbcnt.lo(i32, i32) #1
+define i32 @mbcnt_lo(i32 %a, i32 %b) {
+ %ret = call i32 @llvm.amdgcn.mbcnt.lo(i32 %a, i32 %b)
+ ret i32 %ret
+}
+
; Test assumed range
-; CHECK: declare noundef range(i32 32, 65) i32 @llvm.amdgcn.wavefrontsize() #1
+; CHECK: declare noundef range(i32 32, 65) i32 @llvm.amdgcn.wavefrontsize() #2
define i32 @wavefrontsize() {
%ret = call i32 @llvm.amdgcn.wavefrontsize()
ret i32 %ret
}
; CHECK: attributes #0 = { convergent nocallback nofree nounwind willreturn memory(argmem: readwrite) }
-; CHECK: attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
+; CHECK: attributes #1 = { nocallback nofree nosync nounwind willreturn memory(none) }
+; CHECK: attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
ping
What is the output going to be if the input is undef?
For mbcnt_lo, an unspecified value between 0 and 32. There's an embedded freeze, so the output value is more constrained than the input. For mbcnt_hi, I'm less clear on how to interpret src1.
I think the add just makes this wrong.
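To make the concern about the added operand concrete, here is a rough Python model of the two intrinsics. It is a sketch under assumptions, not taken from this PR: it assumes the usual ISA description (count the set bits of the mask that belong to lanes below the current lane, in the low or high 32-bit half, then add src1 with 32-bit wraparound), and the explicit `lane_id` parameter and the `lanes_below` helper are hypothetical names used only for illustration.

```python
MASK32 = 0xFFFFFFFF  # results are modeled as 32-bit unsigned values


def lanes_below(lane_id: int) -> int:
    """Hypothetical helper: 64-bit mask with one bit per lane below lane_id."""
    return (1 << lane_id) - 1


def mbcnt_lo(mask: int, src1: int, lane_id: int) -> int:
    # Assumed semantics: popcount of mask bits for lanes 0..31 that are
    # below this lane, plus src1, wrapped to 32 bits.
    low_half = lanes_below(lane_id) & MASK32
    return (bin(mask & low_half).count("1") + src1) & MASK32


def mbcnt_hi(mask: int, src1: int, lane_id: int) -> int:
    # Assumed semantics: same as above, but for lanes 32..63.
    high_half = (lanes_below(lane_id) >> 32) & MASK32
    return (bin(mask & high_half).count("1") + src1) & MASK32
```

Under this model the popcount term alone stays between 0 and 32, but the result is only bounded when src1 is. A caller passing src1 = 1000 gets a value outside that interval, which is consistent with the remark that a range cannot simply be attached to the declaration.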