Skip to content

[DAG] canCreateUndefOrPoison - add EXTRACT_SUBVECTOR handling #132745

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 24, 2025

Conversation

RKSimon
Copy link
Collaborator

@RKSimon RKSimon commented Mar 24, 2025

Similar to INSERT_SUBVECTOR - the index is constant and will be inbounds

@RKSimon RKSimon requested review from preames, topperc and bjope March 24, 2025 14:32
@llvmbot llvmbot added backend:X86 llvm:SelectionDAG SelectionDAGISel as well labels Mar 24, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 24, 2025

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)

Changes

Similar to INSERT_SUBVECTOR - the index is constant and will be inbounds


Patch is 22.82 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/132745.diff

7 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+1)
  • (modified) llvm/test/CodeGen/X86/lower-vec-shift.ll (+3-3)
  • (modified) llvm/test/CodeGen/X86/midpoint-int-vec-256.ll (+45-45)
  • (modified) llvm/test/CodeGen/X86/pr62286.ll (+7-10)
  • (modified) llvm/test/CodeGen/X86/shift-i512.ll (+19-24)
  • (modified) llvm/test/CodeGen/X86/vector-fshr-256.ll (+24-24)
  • (modified) llvm/test/CodeGen/X86/vector-gep.ll (+66-62)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index d1f92c9ef00e9..610e159be96bd 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -5490,6 +5490,7 @@ bool SelectionDAG::canCreateUndefOrPoison(SDValue Op, const APInt &DemandedElts,
   case ISD::FREEZE:
   case ISD::CONCAT_VECTORS:
   case ISD::INSERT_SUBVECTOR:
+  case ISD::EXTRACT_SUBVECTOR:
   case ISD::SADDSAT:
   case ISD::UADDSAT:
   case ISD::SSUBSAT:
diff --git a/llvm/test/CodeGen/X86/lower-vec-shift.ll b/llvm/test/CodeGen/X86/lower-vec-shift.ll
index 9d4935ef564de..9ba6f00f532b2 100644
--- a/llvm/test/CodeGen/X86/lower-vec-shift.ll
+++ b/llvm/test/CodeGen/X86/lower-vec-shift.ll
@@ -265,9 +265,9 @@ define <16 x i16> @test11(<16 x i16> %a) {
 ; AVX1-LABEL: test11:
 ; AVX1:       # %bb.0:
 ; AVX1-NEXT:    vextractf128 $1, %ymm0, %xmm1
-; AVX1-NEXT:    vpsllw $3, %xmm1, %xmm2
-; AVX1-NEXT:    vpaddw %xmm1, %xmm1, %xmm1
-; AVX1-NEXT:    vpblendw {{.*#+}} xmm1 = xmm2[0,1,2],xmm1[3,4,5],xmm2[6],xmm1[7]
+; AVX1-NEXT:    vpaddw %xmm1, %xmm1, %xmm2
+; AVX1-NEXT:    vpsllw $3, %xmm1, %xmm1
+; AVX1-NEXT:    vpblendw {{.*#+}} xmm1 = xmm1[0,1,2],xmm2[3,4,5],xmm1[6],xmm2[7]
 ; AVX1-NEXT:    vpsllw $3, %xmm0, %xmm2
 ; AVX1-NEXT:    vpaddw %xmm0, %xmm0, %xmm0
 ; AVX1-NEXT:    vpblendw {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2,3,4],xmm2[5,6,7]
diff --git a/llvm/test/CodeGen/X86/midpoint-int-vec-256.ll b/llvm/test/CodeGen/X86/midpoint-int-vec-256.ll
index f6dc470c18a10..a8021e3164f34 100644
--- a/llvm/test/CodeGen/X86/midpoint-int-vec-256.ll
+++ b/llvm/test/CodeGen/X86/midpoint-int-vec-256.ll
@@ -231,22 +231,22 @@ define <8 x i32> @vec256_i32_signed_mem_reg(ptr %a1_addr, <8 x i32> %a2) nounwin
 define <8 x i32> @vec256_i32_signed_reg_mem(<8 x i32> %a1, ptr %a2_addr) nounwind {
 ; AVX1-LABEL: vec256_i32_signed_reg_mem:
 ; AVX1:       # %bb.0:
-; AVX1-NEXT:    vmovdqa (%rdi), %xmm1
-; AVX1-NEXT:    vmovdqa 16(%rdi), %xmm2
-; AVX1-NEXT:    vpminsd %xmm1, %xmm0, %xmm3
-; AVX1-NEXT:    vpmaxsd %xmm1, %xmm0, %xmm1
-; AVX1-NEXT:    vpsubd %xmm3, %xmm1, %xmm1
+; AVX1-NEXT:    vmovdqa (%rdi), %ymm1
+; AVX1-NEXT:    vpminsd %xmm1, %xmm0, %xmm2
+; AVX1-NEXT:    vpmaxsd %xmm1, %xmm0, %xmm3
+; AVX1-NEXT:    vpsubd %xmm2, %xmm3, %xmm2
 ; AVX1-NEXT:    vextractf128 $1, %ymm0, %xmm3
-; AVX1-NEXT:    vpminsd %xmm2, %xmm3, %xmm4
-; AVX1-NEXT:    vpmaxsd %xmm2, %xmm3, %xmm2
-; AVX1-NEXT:    vpsubd %xmm4, %xmm2, %xmm2
-; AVX1-NEXT:    vpsrld $1, %xmm2, %xmm2
+; AVX1-NEXT:    vextractf128 $1, %ymm1, %xmm1
+; AVX1-NEXT:    vpminsd %xmm1, %xmm3, %xmm4
+; AVX1-NEXT:    vpmaxsd %xmm1, %xmm3, %xmm1
+; AVX1-NEXT:    vpsubd %xmm4, %xmm1, %xmm1
 ; AVX1-NEXT:    vpsrld $1, %xmm1, %xmm1
-; AVX1-NEXT:    vpmulld %xmm1, %xmm1, %xmm1
+; AVX1-NEXT:    vpsrld $1, %xmm2, %xmm2
 ; AVX1-NEXT:    vpmulld %xmm2, %xmm2, %xmm2
-; AVX1-NEXT:    vpaddd %xmm3, %xmm2, %xmm2
-; AVX1-NEXT:    vpaddd %xmm0, %xmm1, %xmm0
-; AVX1-NEXT:    vinsertf128 $1, %xmm2, %ymm0, %ymm0
+; AVX1-NEXT:    vpmulld %xmm1, %xmm1, %xmm1
+; AVX1-NEXT:    vpaddd %xmm3, %xmm1, %xmm1
+; AVX1-NEXT:    vpaddd %xmm0, %xmm2, %xmm0
+; AVX1-NEXT:    vinsertf128 $1, %xmm1, %ymm0, %ymm0
 ; AVX1-NEXT:    retq
 ;
 ; AVX2-LABEL: vec256_i32_signed_reg_mem:
@@ -262,18 +262,18 @@ define <8 x i32> @vec256_i32_signed_reg_mem(<8 x i32> %a1, ptr %a2_addr) nounwin
 ;
 ; XOP-LABEL: vec256_i32_signed_reg_mem:
 ; XOP:       # %bb.0:
-; XOP-NEXT:    vmovdqa (%rdi), %xmm1
-; XOP-NEXT:    vmovdqa 16(%rdi), %xmm2
-; XOP-NEXT:    vextractf128 $1, %ymm0, %xmm3
-; XOP-NEXT:    vpminsd %xmm2, %xmm3, %xmm4
-; XOP-NEXT:    vpmaxsd %xmm2, %xmm3, %xmm2
-; XOP-NEXT:    vpsubd %xmm4, %xmm2, %xmm2
+; XOP-NEXT:    vmovdqa (%rdi), %ymm1
+; XOP-NEXT:    vextractf128 $1, %ymm0, %xmm2
+; XOP-NEXT:    vextractf128 $1, %ymm1, %xmm3
+; XOP-NEXT:    vpminsd %xmm3, %xmm2, %xmm4
+; XOP-NEXT:    vpmaxsd %xmm3, %xmm2, %xmm3
+; XOP-NEXT:    vpsubd %xmm4, %xmm3, %xmm3
 ; XOP-NEXT:    vpminsd %xmm1, %xmm0, %xmm4
 ; XOP-NEXT:    vpmaxsd %xmm1, %xmm0, %xmm1
 ; XOP-NEXT:    vpsubd %xmm4, %xmm1, %xmm1
 ; XOP-NEXT:    vpsrld $1, %xmm1, %xmm1
-; XOP-NEXT:    vpsrld $1, %xmm2, %xmm2
-; XOP-NEXT:    vpmacsdd %xmm3, %xmm2, %xmm2, %xmm2
+; XOP-NEXT:    vpsrld $1, %xmm3, %xmm3
+; XOP-NEXT:    vpmacsdd %xmm2, %xmm3, %xmm3, %xmm2
 ; XOP-NEXT:    vpmacsdd %xmm0, %xmm1, %xmm1, %xmm0
 ; XOP-NEXT:    vinsertf128 $1, %xmm2, %ymm0, %ymm0
 ; XOP-NEXT:    retq
@@ -303,23 +303,23 @@ define <8 x i32> @vec256_i32_signed_reg_mem(<8 x i32> %a1, ptr %a2_addr) nounwin
 define <8 x i32> @vec256_i32_signed_mem_mem(ptr %a1_addr, ptr %a2_addr) nounwind {
 ; AVX1-LABEL: vec256_i32_signed_mem_mem:
 ; AVX1:       # %bb.0:
-; AVX1-NEXT:    vmovdqa (%rsi), %xmm0
-; AVX1-NEXT:    vmovdqa 16(%rsi), %xmm1
-; AVX1-NEXT:    vmovdqa (%rdi), %xmm2
-; AVX1-NEXT:    vmovdqa 16(%rdi), %xmm3
+; AVX1-NEXT:    vmovdqa (%rsi), %ymm0
+; AVX1-NEXT:    vmovdqa (%rdi), %xmm1
+; AVX1-NEXT:    vmovdqa 16(%rdi), %xmm2
+; AVX1-NEXT:    vpminsd %xmm0, %xmm1, %xmm3
+; AVX1-NEXT:    vpmaxsd %xmm0, %xmm1, %xmm4
+; AVX1-NEXT:    vpsubd %xmm3, %xmm4, %xmm3
+; AVX1-NEXT:    vextractf128 $1, %ymm0, %xmm0
 ; AVX1-NEXT:    vpminsd %xmm0, %xmm2, %xmm4
 ; AVX1-NEXT:    vpmaxsd %xmm0, %xmm2, %xmm0
 ; AVX1-NEXT:    vpsubd %xmm4, %xmm0, %xmm0
-; AVX1-NEXT:    vpminsd %xmm1, %xmm3, %xmm4
-; AVX1-NEXT:    vpmaxsd %xmm1, %xmm3, %xmm1
-; AVX1-NEXT:    vpsubd %xmm4, %xmm1, %xmm1
-; AVX1-NEXT:    vpsrld $1, %xmm1, %xmm1
 ; AVX1-NEXT:    vpsrld $1, %xmm0, %xmm0
+; AVX1-NEXT:    vpsrld $1, %xmm3, %xmm3
+; AVX1-NEXT:    vpmulld %xmm3, %xmm3, %xmm3
 ; AVX1-NEXT:    vpmulld %xmm0, %xmm0, %xmm0
-; AVX1-NEXT:    vpmulld %xmm1, %xmm1, %xmm1
-; AVX1-NEXT:    vpaddd %xmm3, %xmm1, %xmm1
 ; AVX1-NEXT:    vpaddd %xmm2, %xmm0, %xmm0
-; AVX1-NEXT:    vinsertf128 $1, %xmm1, %ymm0, %ymm0
+; AVX1-NEXT:    vpaddd %xmm1, %xmm3, %xmm1
+; AVX1-NEXT:    vinsertf128 $1, %xmm0, %ymm1, %ymm0
 ; AVX1-NEXT:    retq
 ;
 ; AVX2-LABEL: vec256_i32_signed_mem_mem:
@@ -336,21 +336,21 @@ define <8 x i32> @vec256_i32_signed_mem_mem(ptr %a1_addr, ptr %a2_addr) nounwind
 ;
 ; XOP-LABEL: vec256_i32_signed_mem_mem:
 ; XOP:       # %bb.0:
-; XOP-NEXT:    vmovdqa (%rsi), %xmm0
-; XOP-NEXT:    vmovdqa 16(%rsi), %xmm1
-; XOP-NEXT:    vmovdqa (%rdi), %xmm2
-; XOP-NEXT:    vmovdqa 16(%rdi), %xmm3
-; XOP-NEXT:    vpminsd %xmm1, %xmm3, %xmm4
-; XOP-NEXT:    vpmaxsd %xmm1, %xmm3, %xmm1
-; XOP-NEXT:    vpsubd %xmm4, %xmm1, %xmm1
-; XOP-NEXT:    vpminsd %xmm0, %xmm2, %xmm4
-; XOP-NEXT:    vpmaxsd %xmm0, %xmm2, %xmm0
+; XOP-NEXT:    vmovdqa (%rsi), %ymm0
+; XOP-NEXT:    vmovdqa (%rdi), %xmm1
+; XOP-NEXT:    vmovdqa 16(%rdi), %xmm2
+; XOP-NEXT:    vextractf128 $1, %ymm0, %xmm3
+; XOP-NEXT:    vpminsd %xmm3, %xmm2, %xmm4
+; XOP-NEXT:    vpmaxsd %xmm3, %xmm2, %xmm3
+; XOP-NEXT:    vpsubd %xmm4, %xmm3, %xmm3
+; XOP-NEXT:    vpminsd %xmm0, %xmm1, %xmm4
+; XOP-NEXT:    vpmaxsd %xmm0, %xmm1, %xmm0
 ; XOP-NEXT:    vpsubd %xmm4, %xmm0, %xmm0
 ; XOP-NEXT:    vpsrld $1, %xmm0, %xmm0
-; XOP-NEXT:    vpsrld $1, %xmm1, %xmm1
-; XOP-NEXT:    vpmacsdd %xmm3, %xmm1, %xmm1, %xmm1
-; XOP-NEXT:    vpmacsdd %xmm2, %xmm0, %xmm0, %xmm0
-; XOP-NEXT:    vinsertf128 $1, %xmm1, %ymm0, %ymm0
+; XOP-NEXT:    vpsrld $1, %xmm3, %xmm3
+; XOP-NEXT:    vpmacsdd %xmm2, %xmm3, %xmm3, %xmm2
+; XOP-NEXT:    vpmacsdd %xmm1, %xmm0, %xmm0, %xmm0
+; XOP-NEXT:    vinsertf128 $1, %xmm2, %ymm0, %ymm0
 ; XOP-NEXT:    retq
 ;
 ; AVX512-LABEL: vec256_i32_signed_mem_mem:
diff --git a/llvm/test/CodeGen/X86/pr62286.ll b/llvm/test/CodeGen/X86/pr62286.ll
index e595b3f3449e2..2d1b7fcbf0239 100644
--- a/llvm/test/CodeGen/X86/pr62286.ll
+++ b/llvm/test/CodeGen/X86/pr62286.ll
@@ -26,20 +26,17 @@ define i64 @PR62286(i32 %a) {
 ; AVX1-LABEL: PR62286:
 ; AVX1:       # %bb.0:
 ; AVX1-NEXT:    vmovd %edi, %xmm0
-; AVX1-NEXT:    vpshufd {{.*#+}} xmm0 = xmm0[0,1,1,0]
-; AVX1-NEXT:    vpaddd %xmm0, %xmm0, %xmm1
-; AVX1-NEXT:    vandps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ymm0, %ymm0
-; AVX1-NEXT:    vblendps {{.*#+}} ymm0 = ymm1[0],ymm0[1,2,3],ymm1[4],ymm0[5,6,7]
-; AVX1-NEXT:    vextractf128 $1, %ymm0, %xmm1
-; AVX1-NEXT:    vorps %xmm1, %xmm0, %xmm0
-; AVX1-NEXT:    vpmovsxdq %xmm0, %xmm1
-; AVX1-NEXT:    vpshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]
+; AVX1-NEXT:    vpslldq {{.*#+}} xmm1 = zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,xmm0[0,1,2,3]
+; AVX1-NEXT:    vpaddd %xmm0, %xmm0, %xmm0
+; AVX1-NEXT:    vpand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm1, %xmm1
+; AVX1-NEXT:    vpblendw {{.*#+}} xmm0 = xmm0[0,1],xmm1[2,3,4,5,6,7]
 ; AVX1-NEXT:    vpmovsxdq %xmm0, %xmm0
-; AVX1-NEXT:    vpaddq %xmm0, %xmm1, %xmm0
+; AVX1-NEXT:    vpshufd {{.*#+}} xmm1 = xmm1[2,3,2,3]
+; AVX1-NEXT:    vpmovsxdq %xmm1, %xmm1
+; AVX1-NEXT:    vpaddq %xmm1, %xmm0, %xmm0
 ; AVX1-NEXT:    vpshufd {{.*#+}} xmm1 = xmm0[2,3,2,3]
 ; AVX1-NEXT:    vpaddq %xmm1, %xmm0, %xmm0
 ; AVX1-NEXT:    vmovq %xmm0, %rax
-; AVX1-NEXT:    vzeroupper
 ; AVX1-NEXT:    retq
 ;
 ; AVX2-LABEL: PR62286:
diff --git a/llvm/test/CodeGen/X86/shift-i512.ll b/llvm/test/CodeGen/X86/shift-i512.ll
index d5ba863a89942..aedf111941dd4 100644
--- a/llvm/test/CodeGen/X86/shift-i512.ll
+++ b/llvm/test/CodeGen/X86/shift-i512.ll
@@ -8,39 +8,34 @@
 define <8 x i64> @shl_i512_1(<8 x i64> %a)  {
 ; AVX512VL-LABEL: shl_i512_1:
 ; AVX512VL:       # %bb.0:
-; AVX512VL-NEXT:    vextracti128 $1, %ymm0, %xmm1
+; AVX512VL-NEXT:    valignq {{.*#+}} zmm1 = zmm0[3,4,5,6,7,0,1,2]
 ; AVX512VL-NEXT:    vextracti32x4 $3, %zmm0, %xmm2
 ; AVX512VL-NEXT:    vextracti32x4 $2, %zmm0, %xmm3
-; AVX512VL-NEXT:    vpaddq %xmm3, %xmm3, %xmm4
-; AVX512VL-NEXT:    vpaddq %xmm2, %xmm2, %xmm5
-; AVX512VL-NEXT:    vinserti128 $1, %xmm5, %ymm4, %ymm4
-; AVX512VL-NEXT:    vinserti128 $1, %xmm3, %ymm1, %ymm5
-; AVX512VL-NEXT:    vpshufd {{.*#+}} ymm5 = ymm5[2,3,2,3,6,7,6,7]
-; AVX512VL-NEXT:    vpsrlq $63, %ymm5, %ymm5
-; AVX512VL-NEXT:    vpor %ymm5, %ymm4, %ymm4
+; AVX512VL-NEXT:    vextracti128 $1, %ymm0, %xmm4
 ; AVX512VL-NEXT:    vpsllq $1, %xmm0, %xmm5
 ; AVX512VL-NEXT:    vpshufd {{.*#+}} xmm6 = xmm0[2,3,2,3]
 ; AVX512VL-NEXT:    vpsrlq $63, %xmm6, %xmm7
-; AVX512VL-NEXT:    vpaddq %xmm1, %xmm1, %xmm8
+; AVX512VL-NEXT:    vpaddq %xmm4, %xmm4, %xmm8
 ; AVX512VL-NEXT:    vpor %xmm7, %xmm8, %xmm7
 ; AVX512VL-NEXT:    vinserti128 $1, %xmm7, %ymm5, %ymm5
-; AVX512VL-NEXT:    vinserti64x4 $1, %ymm4, %zmm5, %zmm4
-; AVX512VL-NEXT:    vpshufd {{.*#+}} xmm5 = xmm3[2,3,2,3]
-; AVX512VL-NEXT:    vpaddq %xmm5, %xmm5, %xmm5
-; AVX512VL-NEXT:    vpshufd {{.*#+}} xmm7 = xmm2[2,3,2,3]
-; AVX512VL-NEXT:    vpaddq %xmm7, %xmm7, %xmm7
-; AVX512VL-NEXT:    vinserti128 $1, %xmm7, %ymm5, %ymm5
-; AVX512VL-NEXT:    vpaddq %xmm6, %xmm6, %xmm6
-; AVX512VL-NEXT:    vpshufd {{.*#+}} xmm7 = xmm1[2,3,2,3]
-; AVX512VL-NEXT:    vpaddq %xmm7, %xmm7, %xmm7
-; AVX512VL-NEXT:    vinserti128 $1, %xmm7, %ymm6, %ymm6
-; AVX512VL-NEXT:    vinserti64x4 $1, %ymm5, %zmm6, %zmm5
+; AVX512VL-NEXT:    vextracti64x4 $1, %zmm0, %ymm7
+; AVX512VL-NEXT:    vpaddq %ymm7, %ymm7, %ymm7
+; AVX512VL-NEXT:    vpsrlq $63, %ymm1, %ymm1
+; AVX512VL-NEXT:    vpor %ymm1, %ymm7, %ymm1
+; AVX512VL-NEXT:    vinserti64x4 $1, %ymm1, %zmm5, %zmm1
+; AVX512VL-NEXT:    vpshufd {{.*#+}} xmm3 = xmm3[2,3,2,3]
+; AVX512VL-NEXT:    vpaddq %xmm3, %xmm3, %xmm3
+; AVX512VL-NEXT:    vpshufd {{.*#+}} xmm2 = xmm2[2,3,2,3]
+; AVX512VL-NEXT:    vpaddq %xmm2, %xmm2, %xmm2
 ; AVX512VL-NEXT:    vinserti128 $1, %xmm2, %ymm3, %ymm2
-; AVX512VL-NEXT:    vinserti128 $1, %xmm1, %ymm0, %ymm0
-; AVX512VL-NEXT:    vinserti64x4 $1, %ymm2, %zmm0, %zmm0
+; AVX512VL-NEXT:    vpaddq %xmm6, %xmm6, %xmm3
+; AVX512VL-NEXT:    vpshufd {{.*#+}} xmm4 = xmm4[2,3,2,3]
+; AVX512VL-NEXT:    vpaddq %xmm4, %xmm4, %xmm4
+; AVX512VL-NEXT:    vinserti128 $1, %xmm4, %ymm3, %ymm3
+; AVX512VL-NEXT:    vinserti64x4 $1, %ymm2, %zmm3, %zmm2
 ; AVX512VL-NEXT:    vpsrlq $63, %zmm0, %zmm0
-; AVX512VL-NEXT:    vporq %zmm0, %zmm5, %zmm0
-; AVX512VL-NEXT:    vpunpcklqdq {{.*#+}} zmm0 = zmm4[0],zmm0[0],zmm4[2],zmm0[2],zmm4[4],zmm0[4],zmm4[6],zmm0[6]
+; AVX512VL-NEXT:    vporq %zmm0, %zmm2, %zmm0
+; AVX512VL-NEXT:    vpunpcklqdq {{.*#+}} zmm0 = zmm1[0],zmm0[0],zmm1[2],zmm0[2],zmm1[4],zmm0[4],zmm1[6],zmm0[6]
 ; AVX512VL-NEXT:    retq
 ;
 ; AVX512VBMI-LABEL: shl_i512_1:
diff --git a/llvm/test/CodeGen/X86/vector-fshr-256.ll b/llvm/test/CodeGen/X86/vector-fshr-256.ll
index 6d7d279a3637a..35c707eac83b4 100644
--- a/llvm/test/CodeGen/X86/vector-fshr-256.ll
+++ b/llvm/test/CodeGen/X86/vector-fshr-256.ll
@@ -39,17 +39,17 @@ define <4 x i64> @var_funnnel_v4i64(<4 x i64> %x, <4 x i64> %y, <4 x i64> %amt)
 ; AVX1-NEXT:    vinsertf128 $1, %xmm5, %ymm1, %ymm1
 ; AVX1-NEXT:    vandnps %ymm3, %ymm2, %ymm2
 ; AVX1-NEXT:    vextractf128 $1, %ymm2, %xmm3
-; AVX1-NEXT:    vpshufd {{.*#+}} xmm4 = xmm3[2,3,2,3]
-; AVX1-NEXT:    vextractf128 $1, %ymm0, %xmm5
-; AVX1-NEXT:    vpaddq %xmm5, %xmm5, %xmm5
-; AVX1-NEXT:    vpsllq %xmm4, %xmm5, %xmm4
-; AVX1-NEXT:    vpsllq %xmm3, %xmm5, %xmm3
-; AVX1-NEXT:    vpblendw {{.*#+}} xmm3 = xmm3[0,1,2,3],xmm4[4,5,6,7]
-; AVX1-NEXT:    vshufps {{.*#+}} xmm4 = xmm2[2,3,2,3]
+; AVX1-NEXT:    vextractf128 $1, %ymm0, %xmm4
+; AVX1-NEXT:    vpaddq %xmm4, %xmm4, %xmm4
+; AVX1-NEXT:    vpsllq %xmm3, %xmm4, %xmm5
+; AVX1-NEXT:    vpshufd {{.*#+}} xmm3 = xmm3[2,3,2,3]
+; AVX1-NEXT:    vpsllq %xmm3, %xmm4, %xmm3
+; AVX1-NEXT:    vpblendw {{.*#+}} xmm3 = xmm5[0,1,2,3],xmm3[4,5,6,7]
 ; AVX1-NEXT:    vpaddq %xmm0, %xmm0, %xmm0
-; AVX1-NEXT:    vpsllq %xmm4, %xmm0, %xmm4
+; AVX1-NEXT:    vpsllq %xmm2, %xmm0, %xmm4
+; AVX1-NEXT:    vpshufd {{.*#+}} xmm2 = xmm2[2,3,2,3]
 ; AVX1-NEXT:    vpsllq %xmm2, %xmm0, %xmm0
-; AVX1-NEXT:    vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3],xmm4[4,5,6,7]
+; AVX1-NEXT:    vpblendw {{.*#+}} xmm0 = xmm4[0,1,2,3],xmm0[4,5,6,7]
 ; AVX1-NEXT:    vinsertf128 $1, %xmm3, %ymm0, %ymm0
 ; AVX1-NEXT:    vorps %ymm1, %ymm0, %ymm0
 ; AVX1-NEXT:    retq
@@ -127,23 +127,23 @@ define <4 x i64> @var_funnnel_v4i64(<4 x i64> %x, <4 x i64> %y, <4 x i64> %amt)
 ; XOPAVX1-LABEL: var_funnnel_v4i64:
 ; XOPAVX1:       # %bb.0:
 ; XOPAVX1-NEXT:    vbroadcastsd {{.*#+}} ymm3 = [63,63,63,63]
-; XOPAVX1-NEXT:    vandps %ymm3, %ymm2, %ymm4
+; XOPAVX1-NEXT:    vandnps %ymm3, %ymm2, %ymm4
 ; XOPAVX1-NEXT:    vextractf128 $1, %ymm4, %xmm5
-; XOPAVX1-NEXT:    vpxor %xmm6, %xmm6, %xmm6
-; XOPAVX1-NEXT:    vpsubq %xmm5, %xmm6, %xmm5
-; XOPAVX1-NEXT:    vextractf128 $1, %ymm1, %xmm7
-; XOPAVX1-NEXT:    vpshlq %xmm5, %xmm7, %xmm5
-; XOPAVX1-NEXT:    vpsubq %xmm4, %xmm6, %xmm4
-; XOPAVX1-NEXT:    vpshlq %xmm4, %xmm1, %xmm1
-; XOPAVX1-NEXT:    vinsertf128 $1, %xmm5, %ymm1, %ymm1
-; XOPAVX1-NEXT:    vandnps %ymm3, %ymm2, %ymm2
-; XOPAVX1-NEXT:    vextractf128 $1, %ymm2, %xmm3
-; XOPAVX1-NEXT:    vextractf128 $1, %ymm0, %xmm4
-; XOPAVX1-NEXT:    vpaddq %xmm4, %xmm4, %xmm4
-; XOPAVX1-NEXT:    vpshlq %xmm3, %xmm4, %xmm3
+; XOPAVX1-NEXT:    vextractf128 $1, %ymm0, %xmm6
+; XOPAVX1-NEXT:    vpaddq %xmm6, %xmm6, %xmm6
+; XOPAVX1-NEXT:    vpshlq %xmm5, %xmm6, %xmm5
 ; XOPAVX1-NEXT:    vpaddq %xmm0, %xmm0, %xmm0
-; XOPAVX1-NEXT:    vpshlq %xmm2, %xmm0, %xmm0
-; XOPAVX1-NEXT:    vinsertf128 $1, %xmm3, %ymm0, %ymm0
+; XOPAVX1-NEXT:    vpshlq %xmm4, %xmm0, %xmm0
+; XOPAVX1-NEXT:    vinsertf128 $1, %xmm5, %ymm0, %ymm0
+; XOPAVX1-NEXT:    vandps %ymm3, %ymm2, %ymm2
+; XOPAVX1-NEXT:    vextractf128 $1, %ymm2, %xmm3
+; XOPAVX1-NEXT:    vpxor %xmm4, %xmm4, %xmm4
+; XOPAVX1-NEXT:    vpsubq %xmm3, %xmm4, %xmm3
+; XOPAVX1-NEXT:    vextractf128 $1, %ymm1, %xmm5
+; XOPAVX1-NEXT:    vpshlq %xmm3, %xmm5, %xmm3
+; XOPAVX1-NEXT:    vpsubq %xmm2, %xmm4, %xmm2
+; XOPAVX1-NEXT:    vpshlq %xmm2, %xmm1, %xmm1
+; XOPAVX1-NEXT:    vinsertf128 $1, %xmm3, %ymm1, %ymm1
 ; XOPAVX1-NEXT:    vorps %ymm1, %ymm0, %ymm0
 ; XOPAVX1-NEXT:    retq
 ;
diff --git a/llvm/test/CodeGen/X86/vector-gep.ll b/llvm/test/CodeGen/X86/vector-gep.ll
index b4cffcd171b33..5c485592295d3 100644
--- a/llvm/test/CodeGen/X86/vector-gep.ll
+++ b/llvm/test/CodeGen/X86/vector-gep.ll
@@ -122,87 +122,91 @@ define <64 x ptr> @AGEP9(ptr %param, <64 x i32> %off) nounwind {
 ; CHECK-NEXT:    movl %esp, %ebp
 ; CHECK-NEXT:    andl $-32, %esp
 ; CHECK-NEXT:    subl $160, %esp
-; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm3
-; CHECK-NEXT:    vbroadcastss 12(%ebp), %xmm5
-; CHECK-NEXT:    vpaddd %xmm3, %xmm5, %xmm3
-; CHECK-NEXT:    vmovdqa %xmm3, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
+; CHECK-NEXT:    vmovdqa %ymm2, %ymm5
+; CHECK-NEXT:    vmovdqa %ymm1, %ymm3
+; CHECK-NEXT:    vmovdqa %ymm0, %ymm1
+; CHECK-NEXT:    vmovdqa 72(%ebp), %ymm0
+; CHECK-NEXT:    vmovdqa 40(%ebp), %ymm2
+; CHECK-NEXT:    vpaddd %xmm2, %xmm2, %xmm4
+; CHECK-NEXT:    vbroadcastss 12(%ebp), %xmm7
+; CHECK-NEXT:    vpaddd %xmm4, %xmm7, %xmm4
+; CHECK-NEXT:    vmovdqa %xmm4, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
+; CHECK-NEXT:    vextractf128 $1, %ymm2, %xmm2
+; CHECK-NEXT:    vpaddd %xmm2, %xmm2, %xmm2
+; CHECK-NEXT:    vpaddd %xmm2, %xmm7, %xmm2
+; CHECK-NEXT:    vmovdqa %xmm2, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
+; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm2
+; CHECK-NEXT:    vpaddd %xmm2, %xmm7, %xmm2
+; CHECK-NEXT:    vmovdqa %xmm2, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
 ; CHECK-NEXT:    vextractf128 $1, %ymm0, %xmm0
 ; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm5, %xmm0
+; CHECK-NEXT:    vpaddd %xmm0, %xmm7, %xmm0
 ; CHECK-NEXT:    vmovdqa %xmm0, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
-; CHECK-NEXT:    vpaddd %xmm1, %xmm1, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm5, %xmm0
-; CHECK-NEXT:    vmovdqa %xmm0, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
-; CHECK-NEXT:    vextractf128 $1, %ymm1, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm5, %xmm0
-; CHECK-NEXT:    vmovdqa %xmm0, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
-; CHECK-NEXT:    vpaddd %xmm2, %xmm2, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm5, %xmm0
-; CHECK-NEXT:    vmovdqa %xmm0, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
-; CHECK-NEXT:    vextractf128 $1, %ymm2, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm5, %xmm0
-; CHECK-NEXT:    vmovdqa %xmm0, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
-; CHECK-NEXT:    vmovdqa 40(%ebp), %xmm0
+; CHECK-NEXT:    vmovdqa 104(%ebp), %ymm0
+; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm2
+; CHECK-NEXT:    vpaddd %xmm2, %xmm7, %xmm2
+; CHECK-NEXT:    vmovdqa %xmm2, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
+; CHECK-NEXT:    vextractf128 $1, %ymm0, %xmm0
 ; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm5, %xmm0
+; CHECK-NEXT:    vpaddd %xmm0, %xmm7, %xmm0
 ; CHECK-NEXT:    vmovdqa %xmm0, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
-; CHECK-NEXT:    vmovdqa 56(%ebp), %xmm0
+; CHECK-NEXT:    vmovdqa 136(%ebp), %ymm0
+; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm2
+; CHECK-NEXT:    vpaddd %xmm2, %xmm7, %xmm2
+; CHECK-NEXT:    vmovdqa %xmm2, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
+; CHECK-NEXT:    vextractf128 $1, %ymm0, %xmm0
 ; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm5, %xmm0
+; CHECK-NEXT:    vpaddd %xmm0, %xmm7, %xmm0
 ; CHECK-NEXT:    vmovdqa %xmm0, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
-; CHECK-NEXT:    vmovdqa 72(%ebp), %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm5, %xmm0
-; CHECK-NEXT:    vmovdqa %xmm0, (%esp) # 16-byte Spill
-; CHECK-NEXT:    vmovdqa 88(%ebp), %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm5, %xmm2
-; CHECK-NEXT:    vmovdqa 104(%ebp), %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm5, %xmm1
-; CHECK-NEXT:    vmovdqa 120(%ebp), %xmm0
+; CHECK-NEXT:    vmovdqa 168(%ebp), %ymm0
+; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm2
+; CHECK-NEXT:    vpaddd %xmm2, %xmm7, %xmm2
+; CHECK-NEXT:    vmovdqa %xmm2, (%esp) # 16-byte Spill
+; CHECK-NEXT:    vextractf128 $1, %ymm0, %xmm0
 ; CHECK-NEXT:    vpaddd %xmm0, %xmm0, %xmm0
-; CHECK-NEXT:    vpaddd %xmm0, %xmm5, %xmm0
-; CHECK-NEXT:    vmovdqa 136(%ebp), %xmm6
-; CHECK-NEXT:    vpaddd %xmm6, %xmm6, %xmm6
-; CHECK-NEXT:    vpaddd %xmm6, %xmm5, %xmm6
-; CHECK-NEXT:    vmovdqa 152(%ebp), %xmm7
-; CHECK-NEXT:    vpaddd %xmm7, %xmm7, %xmm7
-; CHECK-NEXT:    vpaddd %xmm7, %xmm5, %xmm7
-; CHECK-NEXT:    vmovdqa 168(%ebp), %xmm4
-; CHECK-NEXT:    vpaddd %xmm4, %xmm4, %xmm4
-; CHECK-NEXT:    vpaddd %xmm4, %xmm5, %xmm4
-; CHECK-NEXT:    vmovdqa 184(%ebp), %xmm3
+; CHECK-NEXT:    vpaddd %xmm0, %xmm7, %xmm2
+; CHECK-NEXT:    vpa...
[truncated]

Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code is correct, this properly is asserted in SelectionDAG::getNode. I can't comment on the x86 diffs.

Similar to INSERT_SUBVECTOR - the index is constant and will be inbounds
@RKSimon RKSimon force-pushed the dag-nopoison-extract-subvector branch from 17f8520 to 9e47a51 Compare March 24, 2025 14:54
@RKSimon RKSimon merged commit 0237216 into llvm:main Mar 24, 2025
11 checks passed
@RKSimon RKSimon deleted the dag-nopoison-extract-subvector branch March 24, 2025 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants