-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[DAG] getOperationAction - always return Custom action for target opcodes #95401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…odes The target must be responsible for any expansion that needs to be performed for extended types etc. Fixes llvm#95274
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-backend-x86 Author: Simon Pilgrim (RKSimon) ChangesThe target must be responsible for any expansion that needs to be performed for extended types etc. Fixes #95274 Full diff: https://github.com/llvm/llvm-project/pull/95401.diff 2 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index 167e69afb548c..3074ece787a08 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -1255,11 +1255,12 @@ class TargetLoweringBase {
/// be promoted to a larger size, needs to be expanded to some other code
/// sequence, or the target has a custom expander for it.
LegalizeAction getOperationAction(unsigned Op, EVT VT) const {
- if (VT.isExtended()) return Expand;
// If a target-specific SDNode requires legalization, require the target
// to provide custom legalization for it.
if (Op >= std::size(OpActions[0]))
return Custom;
+ if (VT.isExtended())
+ return Expand;
return OpActions[(unsigned)VT.getSimpleVT().SimpleTy][Op];
}
diff --git a/llvm/test/CodeGen/X86/pr95274.ll b/llvm/test/CodeGen/X86/pr95274.ll
new file mode 100644
index 0000000000000..bc43a47a28538
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr95274.ll
@@ -0,0 +1,47 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=x86_64-- -mcpu=skylake-avx512 | FileCheck %s
+
+define void @PR95274(ptr %p0) nounwind {
+; CHECK-LABEL: PR95274:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vcvtps2ph $4, %zmm0, %ymm0
+; CHECK-NEXT: vinsertf64x4 $1, %ymm0, %zmm0, %zmm0
+; CHECK-NEXT: vmovups %zmm0, 1984(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1920(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1856(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1792(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1728(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1664(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1600(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1536(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1472(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1408(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1344(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1280(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1216(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1152(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1088(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 1024(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 960(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 896(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 832(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 768(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 704(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 640(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 576(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 512(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 448(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 384(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 320(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 256(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 192(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 128(%rdi)
+; CHECK-NEXT: vmovups %zmm0, 64(%rdi)
+; CHECK-NEXT: vmovups %zmm0, (%rdi)
+; CHECK-NEXT: vzeroupper
+; CHECK-NEXT: retq
+ %load = load <1024 x float>, ptr %p0, align 2
+ %trunc = fptrunc <1024 x float> poison to <1024 x half>
+ store <1024 x half> %trunc, ptr %p0, align 2
+ ret void
+}
|
This can be solve in this way too:
Which looks less destructive to me. |
I'm not sure I agree, always treating target opcodes as Custom makes more general sense to me. Tweaking combineFP_ROUND is just avoiding the problem surely? |
Requesting Expand for target opcodes makes no sense under any circumstances. The generic expand code can never handle them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. LGTM then.
…odes (llvm#95401) The target must be responsible for any expansion that needs to be performed for extended types etc. Fixes llvm#95274
The target must be responsible for any expansion that needs to be performed for extended types etc.
Fixes #95274