[CostModel][X86] merge vector shuffle costs tests using -cost-kind=all #131819

Merged: RKSimon merged 1 commit into llvm:main from the x86-shuffle-costs-all branch on Mar 18, 2025

Conversation

RKSimon
Collaborator

@RKSimon RKSimon commented Mar 18, 2025

No description provided.

@llvmbot llvmbot added the backend:X86 and llvm:analysis (includes value tracking, cost tables and constant folding) labels Mar 18, 2025
@llvmbot
Member

llvmbot commented Mar 18, 2025

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-llvm-analysis

Author: Simon Pilgrim (RKSimon)

Changes

Patch is 19.00 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/131819.diff

88 Files Affected:

  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-broadcast-codesize.ll (-466)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-broadcast-fp16-codesize.ll (-19)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-broadcast-fp16-latency.ll (-19)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-broadcast-fp16-sizelatency.ll (-19)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-broadcast-fp16.ll (+7-7)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-broadcast-latency.ll (-466)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-broadcast-sizelatency.ll (-466)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-broadcast.ll (+278-278)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-concat_subvector-codesize.ll (-219)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-concat_subvector-latency.ll (-219)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-concat_subvector-sizelatency.ll (-219)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-concat_subvector.ll (+122-122)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-extract_subvector-codesize.ll (-2193)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-extract_subvector-latency.ll (-2193)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-extract_subvector-sizelatency.ll (-2193)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-extract_subvector.ll (+1894-1894)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector-codesize.ll (-1158)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector-latency.ll (-1158)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector-sizelatency.ll (-1158)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector.ll (+903-903)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-load-codesize.ll (-473)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-load-latency.ll (-473)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-load-sizelatency.ll (-473)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-load.ll (+372-372)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-non-pow-2-codesize.ll (-31)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-non-pow-2-latency.ll (-31)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-non-pow-2-sizelatency.ll (-31)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-non-pow-2.ll (+10-10)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i1-codesize.ll (-1255)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i1-latency.ll (-1255)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i1-sizelatency.ll (-1255)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-replication-i1.ll (+960-960)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i16-codesize.ll (-789)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i16-latency.ll (-789)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i16-sizelatency.ll (-789)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-replication-i16.ll (+571-571)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i32-codesize.ll (-521)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i32-latency.ll (-521)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i32-sizelatency.ll (-521)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-replication-i32.ll (+345-345)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i64-codesize.ll (-458)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i64-latency.ll (-458)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i64-sizelatency.ll (-458)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-replication-i64.ll (+289-289)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i8-codesize.ll (-789)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i8-latency.ll (-789)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-replication-i8-sizelatency.ll (-789)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-replication-i8.ll (+571-571)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-reverse-codesize.ll (-346)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-reverse-fp16-codesize.ll (-19)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-reverse-fp16-latency.ll (-19)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-reverse-fp16-sizelatency.ll (-19)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-reverse-fp16.ll (+7-7)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-reverse-latency.ll (-346)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-reverse-sizelatency.ll (-346)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-reverse.ll (+208-208)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-select-codesize.ll (-350)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-select-latency.ll (-350)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-select-sizelatency.ll (-350)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-select.ll (+223-223)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-single-src-codesize.ll (-389)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-single-src-fp16-codesize.ll (-17)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-single-src-fp16-latency.ll (-17)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-single-src-fp16-sizelatency.ll (-17)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-single-src-fp16.ll (+6-6)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-single-src-latency.ll (-389)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-single-src-sizelatency.ll (-389)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-single-src.ll (+243-243)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-splat-codesize.ll (-466)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-splat-latency.ll (-466)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-splat-sizelatency.ll (-466)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-splat.ll (+278-278)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-splice-codesize.ll (-323)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-splice-latency.ll (-323)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-splice-sizelatency.ll (-323)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-splice.ll (+200-200)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-transpose-codesize.ll (-281)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-transpose-latency.ll (-281)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-transpose-sizelatency.ll (-281)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-transpose.ll (+169-169)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-two-src-codesize.ll (-416)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-two-src-fp16-codesize.ll (-17)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-two-src-fp16-latency.ll (-17)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-two-src-fp16-sizelatency.ll (-17)
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-two-src-fp16.ll (+6-6)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-two-src-latency.ll (-416)
  • (removed) llvm/test/Analysis/CostModel/X86/shuffle-two-src-sizelatency.ll ()
  • (modified) llvm/test/Analysis/CostModel/X86/shuffle-two-src.ll ()
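
Reading the list above: each `-codesize`, `-latency` and `-sizelatency` file is removed and the matching base test is modified, which fits the PR title's move to `-cost-kind=all`. The truncated patch does not show the merged files, so the following is only a sketch of what one RUN line in the merged `shuffle-broadcast.ll` might look like. It assumes the RUN line from the removed `shuffle-broadcast-codesize.ll` is kept as-is apart from the `-cost-kind` value.

```llvm
; Sketch only, not copied from the merged test: this is the AVX2 RUN line from
; the removed shuffle-broadcast-codesize.ll with -cost-kind=code-size replaced
; by -cost-kind=all (the flag named in the PR title).
; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=all -mattr=+avx2 | FileCheck %s -check-prefixes=AVX2

; Function body taken from the removed test shown in the diff below.
; CHECK lines are deliberately omitted here; they would need to be regenerated
; with utils/update_analyze_test_checks.py before FileCheck can pass.
define void @test_vXf64(<2 x double> %src128, <4 x double> %src256, <8 x double> %src512) {
  %V128 = shufflevector <2 x double> %src128, <2 x double> undef, <2 x i32> zeroinitializer
  %V256 = shufflevector <4 x double> %src256, <4 x double> undef, <4 x i32> zeroinitializer
  %V512 = shufflevector <8 x double> %src512, <8 x double> undef, <8 x i32> zeroinitializer
  ret void
}
```

With one RUN line per subtarget reporting every cost kind, a single set of assertions per function could replace the four per-kind files, which would account for the three removals listed per base test.
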
diff --git a/llvm/test/Analysis/CostModel/X86/shuffle-broadcast-codesize.ll b/llvm/test/Analysis/CostModel/X86/shuffle-broadcast-codesize.ll
deleted file mode 100644
index a149ec45c863e..0000000000000
--- a/llvm/test/Analysis/CostModel/X86/shuffle-broadcast-codesize.ll
+++ /dev/null
@@ -1,466 +0,0 @@
-; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mattr=+sse2 | FileCheck %s -check-prefixes=SSE,SSE2
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mattr=+ssse3 | FileCheck %s -check-prefixes=SSE,SSSE3
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mattr=+sse4.2 | FileCheck %s -check-prefixes=SSE,SSE42
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mattr=+avx | FileCheck %s -check-prefixes=AVX1
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mattr=+avx2 | FileCheck %s -check-prefixes=AVX2
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mattr=+avx512f | FileCheck %s --check-prefixes=AVX512
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mattr=+avx512f,+avx512bw | FileCheck %s --check-prefixes=AVX512
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mattr=+avx512f,+avx512bw,+avx512vbmi | FileCheck %s --check-prefixes=AVX512
-;
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mcpu=slm | FileCheck %s --check-prefixes=SSE,SSE42
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mcpu=goldmont | FileCheck %s --check-prefixes=SSE,SSE42
-; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mcpu=btver2 | FileCheck %s --check-prefixes=AVX1
-
-;
-; Verify the cost model for broadcast shuffles.
-;
-
-define void @test_vXf64(<2 x double> %src128, <4 x double> %src256, <8 x double> %src512) {
-; SSE-LABEL: 'test_vXf64'
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <2 x double> %src128, <2 x double> undef, <2 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <4 x double> %src256, <4 x double> undef, <4 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <8 x double> %src512, <8 x double> undef, <8 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX1-LABEL: 'test_vXf64'
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <2 x double> %src128, <2 x double> undef, <2 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V256 = shufflevector <4 x double> %src256, <4 x double> undef, <4 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V512 = shufflevector <8 x double> %src512, <8 x double> undef, <8 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX2-LABEL: 'test_vXf64'
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <2 x double> %src128, <2 x double> undef, <2 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <4 x double> %src256, <4 x double> undef, <4 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <8 x double> %src512, <8 x double> undef, <8 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX512-LABEL: 'test_vXf64'
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <2 x double> %src128, <2 x double> undef, <2 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <4 x double> %src256, <4 x double> undef, <4 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <8 x double> %src512, <8 x double> undef, <8 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-  %V128 = shufflevector <2 x double> %src128, <2 x double> undef, <2 x i32> zeroinitializer
-  %V256 = shufflevector <4 x double> %src256, <4 x double> undef, <4 x i32> zeroinitializer
-  %V512 = shufflevector <8 x double> %src512, <8 x double> undef, <8 x i32> zeroinitializer
-  ret void
-}
-
-define void @test_vXi64(<2 x i64> %src128, <4 x i64> %src256, <8 x i64> %src512) {
-; SSE-LABEL: 'test_vXi64'
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <2 x i64> %src128, <2 x i64> undef, <2 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <4 x i64> %src256, <4 x i64> undef, <4 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <8 x i64> %src512, <8 x i64> undef, <8 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX1-LABEL: 'test_vXi64'
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <2 x i64> %src128, <2 x i64> undef, <2 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V256 = shufflevector <4 x i64> %src256, <4 x i64> undef, <4 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V512 = shufflevector <8 x i64> %src512, <8 x i64> undef, <8 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX2-LABEL: 'test_vXi64'
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <2 x i64> %src128, <2 x i64> undef, <2 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <4 x i64> %src256, <4 x i64> undef, <4 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <8 x i64> %src512, <8 x i64> undef, <8 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX512-LABEL: 'test_vXi64'
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <2 x i64> %src128, <2 x i64> undef, <2 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <4 x i64> %src256, <4 x i64> undef, <4 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <8 x i64> %src512, <8 x i64> undef, <8 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-  %V128 = shufflevector <2 x i64> %src128, <2 x i64> undef, <2 x i32> zeroinitializer
-  %V256 = shufflevector <4 x i64> %src256, <4 x i64> undef, <4 x i32> zeroinitializer
-  %V512 = shufflevector <8 x i64> %src512, <8 x i64> undef, <8 x i32> zeroinitializer
-  ret void
-}
-
-define void @test_vXf32(<2 x float> %src64, <4 x float> %src128, <8 x float> %src256, <16 x float> %src512) {
-; SSE-LABEL: 'test_vXf32'
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64 = shufflevector <2 x float> %src64, <2 x float> undef, <2 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <4 x float> %src128, <4 x float> undef, <4 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <8 x float> %src256, <8 x float> undef, <8 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <16 x float> %src512, <16 x float> undef, <16 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX1-LABEL: 'test_vXf32'
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64 = shufflevector <2 x float> %src64, <2 x float> undef, <2 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <4 x float> %src128, <4 x float> undef, <4 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V256 = shufflevector <8 x float> %src256, <8 x float> undef, <8 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V512 = shufflevector <16 x float> %src512, <16 x float> undef, <16 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX2-LABEL: 'test_vXf32'
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64 = shufflevector <2 x float> %src64, <2 x float> undef, <2 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <4 x float> %src128, <4 x float> undef, <4 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <8 x float> %src256, <8 x float> undef, <8 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <16 x float> %src512, <16 x float> undef, <16 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX512-LABEL: 'test_vXf32'
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64 = shufflevector <2 x float> %src64, <2 x float> undef, <2 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <4 x float> %src128, <4 x float> undef, <4 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <8 x float> %src256, <8 x float> undef, <8 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <16 x float> %src512, <16 x float> undef, <16 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-  %V64 = shufflevector <2 x float> %src64, <2 x float> undef, <2 x i32> zeroinitializer
-  %V128 = shufflevector <4 x float> %src128, <4 x float> undef, <4 x i32> zeroinitializer
-  %V256 = shufflevector <8 x float> %src256, <8 x float> undef, <8 x i32> zeroinitializer
-  %V512 = shufflevector <16 x float> %src512, <16 x float> undef, <16 x i32> zeroinitializer
-  ret void
-}
-
-define void @test_vXi32(<2 x i32> %src64, <4 x i32> %src128, <8 x i32> %src256, <16 x i32> %src512) {
-; SSE-LABEL: 'test_vXi32'
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64 = shufflevector <2 x i32> %src64, <2 x i32> undef, <2 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <4 x i32> %src128, <4 x i32> undef, <4 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <8 x i32> %src256, <8 x i32> undef, <8 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <16 x i32> %src512, <16 x i32> undef, <16 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX1-LABEL: 'test_vXi32'
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64 = shufflevector <2 x i32> %src64, <2 x i32> undef, <2 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <4 x i32> %src128, <4 x i32> undef, <4 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V256 = shufflevector <8 x i32> %src256, <8 x i32> undef, <8 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V512 = shufflevector <16 x i32> %src512, <16 x i32> undef, <16 x i32> zeroinitializer
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX2-LABEL: 'test_vXi32'
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64 = shufflevector <2 x i32> %src64, <2 x i32> undef, <2 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <4 x i32> %src128, <4 x i32> undef, <4 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <8 x i32> %src256, <8 x i32> undef, <8 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <16 x i32> %src512, <16 x i32> undef, <16 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX512-LABEL: 'test_vXi32'
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64 = shufflevector <2 x i32> %src64, <2 x i32> undef, <2 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <4 x i32> %src128, <4 x i32> undef, <4 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <8 x i32> %src256, <8 x i32> undef, <8 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <16 x i32> %src512, <16 x i32> undef, <16 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-  %V64 = shufflevector <2 x i32> %src64, <2 x i32> undef, <2 x i32> zeroinitializer
-  %V128 = shufflevector <4 x i32> %src128, <4 x i32> undef, <4 x i32> zeroinitializer
-  %V256 = shufflevector <8 x i32> %src256, <8 x i32> undef, <8 x i32> zeroinitializer
-  %V512 = shufflevector <16 x i32> %src512, <16 x i32> undef, <16 x i32> zeroinitializer
-  ret void
-}
-
-define void @test_vXf16(<2 x half> %src32, <4 x half> %src64, <8 x half> %src128, <16 x half> %src256, <32 x half> %src512) {
-; SSE2-LABEL: 'test_vXf16'
-; SSE2-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V32 = shufflevector <2 x half> %src32, <2 x half> undef, <2 x i32> zeroinitializer
-; SSE2-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V64 = shufflevector <4 x half> %src64, <4 x half> undef, <4 x i32> zeroinitializer
-; SSE2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V128 = shufflevector <8 x half> %src128, <8 x half> undef, <8 x i32> zeroinitializer
-; SSE2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V256 = shufflevector <16 x half> %src256, <16 x half> undef, <16 x i32> zeroinitializer
-; SSE2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V512 = shufflevector <32 x half> %src512, <32 x half> undef, <32 x i32> zeroinitializer
-; SSE2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; SSSE3-LABEL: 'test_vXf16'
-; SSSE3-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V32 = shufflevector <2 x half> %src32, <2 x half> undef, <2 x i32> zeroinitializer
-; SSSE3-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V64 = shufflevector <4 x half> %src64, <4 x half> undef, <4 x i32> zeroinitializer
-; SSSE3-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <8 x half> %src128, <8 x half> undef, <8 x i32> zeroinitializer
-; SSSE3-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <16 x half> %src256, <16 x half> undef, <16 x i32> zeroinitializer
-; SSSE3-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <32 x half> %src512, <32 x half> undef, <32 x i32> zeroinitializer
-; SSSE3-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; SSE42-LABEL: 'test_vXf16'
-; SSE42-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V32 = shufflevector <2 x half> %src32, <2 x half> undef, <2 x i32> zeroinitializer
-; SSE42-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V64 = shufflevector <4 x half> %src64, <4 x half> undef, <4 x i32> zeroinitializer
-; SSE42-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <8 x half> %src128, <8 x half> undef, <8 x i32> zeroinitializer
-; SSE42-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <16 x half> %src256, <16 x half> undef, <16 x i32> zeroinitializer
-; SSE42-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <32 x half> %src512, <32 x half> undef, <32 x i32> zeroinitializer
-; SSE42-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX2-LABEL: 'test_vXf16'
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V32 = shufflevector <2 x half> %src32, <2 x half> undef, <2 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V64 = shufflevector <4 x half> %src64, <4 x half> undef, <4 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <8 x half> %src128, <8 x half> undef, <8 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <16 x half> %src256, <16 x half> undef, <16 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <32 x half> %src512, <32 x half> undef, <32 x i32> zeroinitializer
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-; AVX512-LABEL: 'test_vXf16'
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V32 = shufflevector <2 x half> %src32, <2 x half> undef, <2 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V64 = shufflevector <4 x half> %src64, <4 x half> undef, <4 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V128 = shufflevector <8 x half> %src128, <8 x half> undef, <8 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V256 = shufflevector <16 x half> %src256, <16 x half> undef, <16 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V512 = shufflevector <32 x half> %src512, <32 x half> undef, <32 x i32> zeroinitializer
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
-;
-  %V32  = shufflevector <2 x half> %src32, <2 x half> undef, <2 x i32> zeroinitializer
-  %V64  = shufflevector <4 x half> %src64, <4 x half> undef, <4 x i32> zeroinitializer
-  %V128 = shufflevector <8 x half> %src128, <8 x half> undef, <8 x i32> zeroinitializer
-  %V256 = shufflevector <16 x half> %src256, <16 x half> undef, <16 x i32> zeroinitializer
-  %V512 = shufflevector <32 x half> %src512, <32 x half> undef, <32 x i32> zeroinitializer
-  ret void
-}
-
-define void @test_vXbf16(<2 x bfloat> %src32, <4 x bfloat> %src64, <8 x bfloat> %src128, <16 x bfloat> %src256, <32 x bfloat> %src512) {
-; SSE-LABEL: 'test_vXbf16'
-; SSE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V32 = shufflevector <2 x bfloat> %src32, <2 x bfloat> undef, <2 x i32> zeroinitializer
-; SSE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V64 = shufflevector <4 x bfloat> %sr...
[truncated]

@RKSimon RKSimon merged commit 33e5d01 into llvm:main Mar 18, 2025
11 of 13 checks passed
@RKSimon RKSimon deleted the x86-shuffle-costs-all branch March 18, 2025 16:19
2 participants