[X86] combineConcatVectorOps - convert X86ISD::PALIGNR concatenation to use combineConcatVectorOps recursion #130572

RKSimon · 2025-03-10T10:18:09Z

Only concatenate X86ISD::PALIGNR nodes if at least one operand is beneficial to concatenate

…to use combineConcatVectorOps recursion Only concatenate X86ISD::PALIGNR nodes if at least one operand is beneficial to concatenate

llvmbot · 2025-03-10T10:18:48Z

@llvm/pr-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)

Changes

Only concatenate X86ISD::PALIGNR nodes if at least one operand is beneficial to concatenate

Full diff: https://github.com/llvm/llvm-project/pull/130572.diff

2 Files Affected:

(modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+7-3)
(modified) llvm/test/CodeGen/X86/vector-shuffle-combining-avx2.ll (+4-6)

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index d83033d24bdbb..ccd7f2418fcd1 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -58443,9 +58443,13 @@ static SDValue combineConcatVectorOps(const SDLoc &DL, MVT VT,
           llvm::all_of(Ops, [Op0](SDValue Op) {
             return Op0.getOperand(2) == Op.getOperand(2);
           })) {
-        return DAG.getNode(Op0.getOpcode(), DL, VT,
-                           ConcatSubOperand(VT, Ops, 0),
-                           ConcatSubOperand(VT, Ops, 1), Op0.getOperand(2));
+        SDValue Concat0 = CombineSubOperand(VT, Ops, 0);
+        SDValue Concat1 = CombineSubOperand(VT, Ops, 1);
+        if (Concat0 || Concat1)
+          return DAG.getNode(Op0.getOpcode(), DL, VT,
+                             Concat0 ? Concat0 : ConcatSubOperand(VT, Ops, 0),
+                             Concat1 ? Concat1 : ConcatSubOperand(VT, Ops, 1),
+                             Op0.getOperand(2));
       }
       break;
     case X86ISD::BLENDI:
diff --git a/llvm/test/CodeGen/X86/vector-shuffle-combining-avx2.ll b/llvm/test/CodeGen/X86/vector-shuffle-combining-avx2.ll
index a6e66af77dd20..399f137676fb4 100644
--- a/llvm/test/CodeGen/X86/vector-shuffle-combining-avx2.ll
+++ b/llvm/test/CodeGen/X86/vector-shuffle-combining-avx2.ll
@@ -775,15 +775,13 @@ define <32 x i8> @combine_pshufb_pshufb_or_pshufb(<32 x i8> %a0) {
   ret <32 x i8> %4
 }
 
-; TODO: Not beneficial to concatenate both inputs just to create a 256-bit palignr
+; Not beneficial to concatenate both inputs just to create a 256-bit palignr
 define <32 x i8> @concat_alignr_unnecessary(<16 x i8> %a0, <16 x i8> noundef %a1, <16 x i8> %a2) nounwind {
 ; CHECK-LABEL: concat_alignr_unnecessary:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    # kill: def $xmm1 killed $xmm1 def $ymm1
-; CHECK-NEXT:    # kill: def $xmm0 killed $xmm0 def $ymm0
-; CHECK-NEXT:    vinserti128 $1, %xmm2, %ymm1, %ymm1
-; CHECK-NEXT:    vinserti128 $1, %xmm0, %ymm0, %ymm0
-; CHECK-NEXT:    vpalignr {{.*#+}} ymm0 = ymm1[3,4,5,6,7,8,9,10,11,12,13,14,15],ymm0[0,1,2],ymm1[19,20,21,22,23,24,25,26,27,28,29,30,31],ymm0[16,17,18]
+; CHECK-NEXT:    vpalignr {{.*#+}} xmm1 = xmm1[3,4,5,6,7,8,9,10,11,12,13,14,15],xmm0[0,1,2]
+; CHECK-NEXT:    vpalignr {{.*#+}} xmm0 = xmm2[3,4,5,6,7,8,9,10,11,12,13,14,15],xmm0[0,1,2]
+; CHECK-NEXT:    vinserti128 $1, %xmm0, %ymm1, %ymm0
 ; CHECK-NEXT:    ret{{[l|q]}}
   %lo = shufflevector <16 x i8> %a1, <16 x i8> %a0, <16 x i32> <i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18>
   %hi = shufflevector <16 x i8> %a2, <16 x i8> %a0, <16 x i32> <i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18>

[X86] combineConcatVectorOps - convert X86ISD::PALIGNR concatenation …

5550478

…to use combineConcatVectorOps recursion Only concatenate X86ISD::PALIGNR nodes if at least one operand is beneficial to concatenate

llvmbot added the backend:X86 label Mar 10, 2025

RKSimon merged commit 99fdb5d into llvm:main Mar 10, 2025
11 of 13 checks passed

RKSimon deleted the x86-concat-palignr branch March 10, 2025 11:32

shiltian mentioned this pull request Mar 10, 2025

[AMDGPU] Fix test failures when expensive checks are enabled #130644

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[X86] combineConcatVectorOps - convert X86ISD::PALIGNR concatenation to use combineConcatVectorOps recursion #130572

[X86] combineConcatVectorOps - convert X86ISD::PALIGNR concatenation to use combineConcatVectorOps recursion #130572

Uh oh!

RKSimon commented Mar 10, 2025

Uh oh!

llvmbot commented Mar 10, 2025

Uh oh!

Uh oh!

Uh oh!

[X86] combineConcatVectorOps - convert X86ISD::PALIGNR concatenation to use combineConcatVectorOps recursion #130572

[X86] combineConcatVectorOps - convert X86ISD::PALIGNR concatenation to use combineConcatVectorOps recursion #130572

Uh oh!

Conversation

RKSimon commented Mar 10, 2025

Uh oh!

llvmbot commented Mar 10, 2025

Uh oh!

Uh oh!

Uh oh!