Skip to content

[X86] getFauxShuffleMask - always match insert_subvector(insert_subvector(undef,sub,0),sub,c) 'subvector splat' patterns #130115

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 6, 2025

Conversation

RKSimon
Copy link
Collaborator

@RKSimon RKSimon commented Mar 6, 2025

The plan is to remove the vXi64 cross lane shuffle constraint entirely, but this special case was easy to handle while I fight the remaining regressions.

@llvmbot
Copy link
Member

llvmbot commented Mar 6, 2025

@llvm/pr-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)

Changes

The plan is to remove the vXi64 constraint entirely, but this special case was easy to handle while I fight the remaining regressions.


Full diff: https://github.com/llvm/llvm-project/pull/130115.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+3-2)
  • (modified) llvm/test/CodeGen/X86/vector-partial-undef.ll (+1-2)
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 1697da09e0f72..e418bfa8b20e8 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -6153,12 +6153,13 @@ static bool getFauxShuffleMask(SDValue N, const APInt &DemandedElts,
       return true;
     }
     // Handle CONCAT(SUB0, SUB1).
-    // Limit this to vXi64 vector cases to make the most of cross lane shuffles.
+    // Limit this to vXi64/splat vector cases to make the most of cross lane shuffles.
     if (Depth > 0 && InsertIdx == NumSubElts && NumElts == (2 * NumSubElts) &&
-        NumBitsPerElt == 64 && Src.getOpcode() == ISD::INSERT_SUBVECTOR &&
+        Src.getOpcode() == ISD::INSERT_SUBVECTOR &&
         Src.getOperand(0).isUndef() &&
         Src.getOperand(1).getValueType() == SubVT &&
         Src.getConstantOperandVal(2) == 0 &&
+        (NumBitsPerElt == 64 || Src.getOperand(1) == Sub) &&
         SDNode::areOnlyUsersOf({N.getNode(), Src.getNode()}, Sub.getNode())) {
       for (int i = 0; i != (int)NumSubElts; ++i)
         Mask.push_back(i);
diff --git a/llvm/test/CodeGen/X86/vector-partial-undef.ll b/llvm/test/CodeGen/X86/vector-partial-undef.ll
index 4753dba2d468f..7c12e5295257c 100644
--- a/llvm/test/CodeGen/X86/vector-partial-undef.ll
+++ b/llvm/test/CodeGen/X86/vector-partial-undef.ll
@@ -150,8 +150,7 @@ define <8 x i32> @xor_undef_elts_alt(<4 x i32> %x) {
 ; AVX-LABEL: xor_undef_elts_alt:
 ; AVX:       # %bb.0:
 ; AVX-NEXT:    # kill: def $xmm0 killed $xmm0 def $ymm0
-; AVX-NEXT:    vinsertf128 $1, %xmm0, %ymm0, %ymm0
-; AVX-NEXT:    vmovaps {{.*#+}} ymm1 = [6,1,5,4,3,2,0,7]
+; AVX-NEXT:    vmovaps {{.*#+}} ymm1 = [2,1,1,0,3,2,0,3]
 ; AVX-NEXT:    vpermps %ymm0, %ymm1, %ymm0
 ; AVX-NEXT:    vxorps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ymm0, %ymm0
 ; AVX-NEXT:    retq

…ctor(undef,sub,0),sub,c) 'subvector splat' patterns

The plan is to remove the vXi64 constraint entirely, but this special case was easy to handle while I fight the remaining regressions.
@RKSimon RKSimon force-pushed the x86-faux-subvector-splat branch from efafd89 to 6a56610 Compare March 6, 2025 15:13
@RKSimon RKSimon merged commit ea59d17 into llvm:main Mar 6, 2025
8 of 10 checks passed
@RKSimon RKSimon deleted the x86-faux-subvector-splat branch March 6, 2025 16:06
jph-13 pushed a commit to jph-13/llvm-project that referenced this pull request Mar 21, 2025
…ctor(undef,sub,0),sub,c) 'subvector splat' patterns (llvm#130115)

The plan is to remove the vXi64 cross lane shuffle constraint entirely, but this special 'splat' case was easy to handle while I fight the remaining regressions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants