-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[X86] Do not combine LRINT and TRUNC #125848
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Per to discussions in llvm#125324, most participants are opposed to this optimization. So remove the combination to address the concerns. Fixes llvm#125324
@llvm/pr-subscribers-backend-x86 Author: Phoebe Wang (phoebewang) ChangesPer to discussions in #125324, most participants are opposed to this optimization. So remove the combination to address the concerns. Fixes #125324 Full diff: https://github.com/llvm/llvm-project/pull/125848.diff 2 Files Affected:
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 6cf6061deba702..5686d8bcbe85cd 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -53906,11 +53906,6 @@ static SDValue combineTruncate(SDNode *N, SelectionDAG &DAG,
return DAG.getNode(X86ISD::MMX_MOVD2W, DL, MVT::i32, BCSrc);
}
- // Try to combine (trunc (vNi64 (lrint x))) to (vNi32 (lrint x)).
- if (Src.getOpcode() == ISD::LRINT && VT.getScalarType() == MVT::i32 &&
- Src.hasOneUse())
- return DAG.getNode(ISD::LRINT, DL, VT, Src.getOperand(0));
-
return SDValue();
}
diff --git a/llvm/test/CodeGen/X86/lrint-conv-i64.ll b/llvm/test/CodeGen/X86/lrint-conv-i64.ll
index 01b0af2f807f20..38fa09085e1898 100644
--- a/llvm/test/CodeGen/X86/lrint-conv-i64.ll
+++ b/llvm/test/CodeGen/X86/lrint-conv-i64.ll
@@ -45,6 +45,24 @@ entry:
ret i64 %0
}
+define i32 @PR125324(float %x) {
+; SSE-LABEL: PR125324:
+; SSE: # %bb.0: # %entry
+; SSE-NEXT: cvtss2si %xmm0, %rax
+; SSE-NEXT: # kill: def $eax killed $eax killed $rax
+; SSE-NEXT: retq
+;
+; AVX-LABEL: PR125324:
+; AVX: # %bb.0: # %entry
+; AVX-NEXT: vcvtss2si %xmm0, %rax
+; AVX-NEXT: # kill: def $eax killed $eax killed $rax
+; AVX-NEXT: retq
+entry:
+ %0 = tail call i64 @llvm.lrint.i64.f32(float %x)
+ %1 = trunc i64 %0 to i32
+ ret i32 %1
+}
+
declare i64 @llvm.lrint.i64.f32(float) nounwind readnone
declare i64 @llvm.lrint.i64.f64(double) nounwind readnone
declare i64 @llvm.lrint.i64.f80(x86_fp80) nounwind readnone
|
@@ -45,6 +45,24 @@ entry: | |||
ret i64 %0 | |||
} | |||
|
|||
define i32 @PR125324(float %x) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that mean we had no tests for this combine before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's true 🤦♀️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/cherry-pick 8c222c1 |
/pull-request #125995 |
Try to improve performance after llvm#125848
Per to discussions in llvm#125324, most participants are opposed to this optimization. So remove the combination to address the concerns. Fixes llvm#125324 (cherry picked from commit 8c222c1)
Per to discussions in llvm#125324, most participants are opposed to this optimization. So remove the combination to address the concerns. Fixes llvm#125324
Per to discussions in llvm#125324, most participants are opposed to this optimization. So remove the combination to address the concerns. Fixes llvm#125324 (cherry picked from commit 8c222c1)
Per to discussions in llvm#125324, most participants are opposed to this optimization. So remove the combination to address the concerns. Fixes llvm#125324 (cherry picked from commit 8c222c1) Co-authored-by: Phoebe Wang <[email protected]>
Per to discussions in #125324, most participants are opposed to this optimization. So remove the combination to address the concerns.
Fixes #125324