[RISCV] Disable i1 fixed vectors with more than 1024 elements. #133267

Merged
merged 2 commits into from
Mar 28, 2025

Conversation

topperc
Collaborator

@topperc topperc commented Mar 27, 2025

v2048i1 is an MVT, but v2048i8 is not, so we don't support i8 vectors with more than 1024 elements. Lowering a v2048i1 shufflevector would require promoting to v2048i8. Since v2048i8 isn't legal and isn't an MVT, this leads to a crash.

To fix the crash, this patch makes v2048i1 an illegal type.
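The effect of the extra element-count check can be sketched in isolation. The snippet below is only a model: `FixedVec`, `passesOldLimit`, and `passesNewLimit` are hypothetical names, not LLVM APIs, and the real `useRVVForFixedLengthVectorVT` performs further checks that are not modeled here.

```cpp
#include <cstdint>

// Hypothetical stand-in for a fixed-length vector type: element count and
// element width in bits. This is NOT LLVM's MVT class.
struct FixedVec {
  unsigned NumElts;
  unsigned EltBits;
  constexpr uint64_t sizeInBits() const {
    return static_cast<uint64_t>(NumElts) * EltBits;
  }
};

// Models the check before the patch: only the total size is limited.
constexpr bool passesOldLimit(FixedVec VT) {
  return VT.sizeInBits() <= 1024 * 8;
}

// Models the check after the patch: the element count is limited as well.
constexpr bool passesNewLimit(FixedVec VT) {
  return VT.NumElts <= 1024 && VT.sizeInBits() <= 1024 * 8;
}

// v2048i1 is only 2048 bits, so the size-only check lets it through.
static_assert(passesOldLimit(FixedVec{2048, 1}), "old check accepts v2048i1");
// With the element-count check it is rejected.
static_assert(!passesNewLimit(FixedVec{2048, 1}), "new check rejects v2048i1");
// Types at or below both limits are unaffected.
static_assert(passesNewLimit(FixedVec{1024, 1}), "v1024i1 still accepted");
static_assert(passesNewLimit(FixedVec{1024, 8}), "v1024i8 still accepted");
```

In this model, a type is accepted only when both its size and its element count are within the limits, so promoting an accepted i1 vector to i8 stays within 1024 elements.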

@llvmbot
Member

llvmbot commented Mar 27, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)

Changes

v2048i1 is an MVT, but v2048i8 is not, so we don't support i8 vectors with more than 1024 elements. Lowering a v2048i1 shufflevector would require promoting to v2048i8. Since v2048i8 isn't legal and isn't an MVT, this leads to a crash.

To fix the crash, this patch makes v2048i1 an illegal type.


Patch is 706.99 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/133267.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+1-1)
  • (added) llvm/test/CodeGen/RISCV/rvv/pr133217.ll (+19477)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 5b5dca4b541df..97186496a97f4 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -2631,7 +2631,7 @@ static bool useRVVForFixedLengthVectorVT(MVT VT,
   // across all supported vector element types to avoid legalization issues.
   // Therefore -- since the largest is v1024i8/v512i16/etc -- the largest
   // fixed-length vector type we support is 1024 bytes.
-  if (VT.getFixedSizeInBits() > 1024 * 8)
+  if (VT.getVectorNumElements() > 1024 || VT.getFixedSizeInBits() > 1024 * 8)
     return false;
 
   unsigned MinVLen = Subtarget.getRealMinVLen();
diff --git a/llvm/test/CodeGen/RISCV/rvv/pr133217.ll b/llvm/test/CodeGen/RISCV/rvv/pr133217.ll
new file mode 100644
index 0000000000000..dd81a84135069
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/pr133217.ll
@@ -0,0 +1,19477 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=riscv64 -mattr=+v,+zvl2048b | FileCheck %s
+
+define <2048 x i1> @foo(<1024 x i1> %x, <1024 x i1> %y) {
+; CHECK-LABEL: foo:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    addi sp, sp, -2032
+; CHECK-NEXT:    .cfi_def_cfa_offset 2032
+; CHECK-NEXT:    sd ra, 2024(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    sd s0, 2016(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    sd s2, 2008(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    sd s3, 2000(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    sd s4, 1992(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    sd s5, 1984(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    sd s6, 1976(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    sd s7, 1968(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    sd s8, 1960(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    sd s9, 1952(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    sd s10, 1944(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    sd s11, 1936(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    .cfi_offset ra, -8
+; CHECK-NEXT:    .cfi_offset s0, -16
+; CHECK-NEXT:    .cfi_offset s2, -24
+; CHECK-NEXT:    .cfi_offset s3, -32
+; CHECK-NEXT:    .cfi_offset s4, -40
+; CHECK-NEXT:    .cfi_offset s5, -48
+; CHECK-NEXT:    .cfi_offset s6, -56
+; CHECK-NEXT:    .cfi_offset s7, -64
+; CHECK-NEXT:    .cfi_offset s8, -72
+; CHECK-NEXT:    .cfi_offset s9, -80
+; CHECK-NEXT:    .cfi_offset s10, -88
+; CHECK-NEXT:    .cfi_offset s11, -96
+; CHECK-NEXT:    addi s0, sp, 2032
+; CHECK-NEXT:    .cfi_def_cfa s0, 0
+; CHECK-NEXT:    lui a0, 2
+; CHECK-NEXT:    addiw a0, a0, 1040
+; CHECK-NEXT:    sub sp, sp, a0
+; CHECK-NEXT:    csrr a0, vlenb
+; CHECK-NEXT:    slli a0, a0, 1
+; CHECK-NEXT:    mv a1, a0
+; CHECK-NEXT:    slli a0, a0, 2
+; CHECK-NEXT:    add a1, a1, a0
+; CHECK-NEXT:    slli a0, a0, 2
+; CHECK-NEXT:    add a1, a1, a0
+; CHECK-NEXT:    slli a0, a0, 1
+; CHECK-NEXT:    add a1, a1, a0
+; CHECK-NEXT:    slli a0, a0, 2
+; CHECK-NEXT:    add a0, a0, a1
+; CHECK-NEXT:    sub sp, sp, a0
+; CHECK-NEXT:    andi sp, sp, -1024
+; CHECK-NEXT:    li s2, 1024
+; CHECK-NEXT:    lui a0, 2
+; CHECK-NEXT:    add a5, sp, a0
+; CHECK-NEXT:    li a0, 3
+; CHECK-NEXT:    slli a0, a0, 11
+; CHECK-NEXT:    add t2, sp, a0
+; CHECK-NEXT:    li s11, 127
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 2040
+; CHECK-NEXT:    add a7, sp, a0
+; CHECK-NEXT:    li s10, 126
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 2032
+; CHECK-NEXT:    add t3, sp, a0
+; CHECK-NEXT:    li s9, 125
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 2024
+; CHECK-NEXT:    add a2, sp, a0
+; CHECK-NEXT:    li s8, 124
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 2016
+; CHECK-NEXT:    add t1, sp, a0
+; CHECK-NEXT:    li s7, 123
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 2008
+; CHECK-NEXT:    add a3, sp, a0
+; CHECK-NEXT:    li s6, 122
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 2000
+; CHECK-NEXT:    add t0, sp, a0
+; CHECK-NEXT:    li s5, 121
+; CHECK-NEXT:    li s4, 120
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1984
+; CHECK-NEXT:    add a4, sp, a0
+; CHECK-NEXT:    li s3, 119
+; CHECK-NEXT:    li t6, 118
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1968
+; CHECK-NEXT:    add a6, sp, a0
+; CHECK-NEXT:    li t5, 117
+; CHECK-NEXT:    li t4, 116
+; CHECK-NEXT:    vsetvli zero, s2, e8, m4, ta, ma
+; CHECK-NEXT:    vmv.v.i v16, 0
+; CHECK-NEXT:    vmerge.vim v12, v16, 1, v0
+; CHECK-NEXT:    vse8.v v12, (a5)
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1952
+; CHECK-NEXT:    add a1, sp, a0
+; CHECK-NEXT:    vmv1r.v v0, v8
+; CHECK-NEXT:    vmerge.vim v8, v16, 1, v0
+; CHECK-NEXT:    vse8.v v8, (t2)
+; CHECK-NEXT:    li a0, 115
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vslidedown.vx v10, v12, s11
+; CHECK-NEXT:    vslidedown.vx v11, v8, s11
+; CHECK-NEXT:    lui a5, 1
+; CHECK-NEXT:    addiw a5, a5, 1944
+; CHECK-NEXT:    add a5, sp, a5
+; CHECK-NEXT:    vslidedown.vx v25, v12, s10
+; CHECK-NEXT:    vslidedown.vx v14, v8, s10
+; CHECK-NEXT:    li s10, 114
+; CHECK-NEXT:    vslidedown.vx v26, v12, s9
+; CHECK-NEXT:    vslidedown.vx v15, v8, s9
+; CHECK-NEXT:    lui t2, 1
+; CHECK-NEXT:    addiw t2, t2, 1936
+; CHECK-NEXT:    add s2, sp, t2
+; CHECK-NEXT:    vslidedown.vx v27, v12, s8
+; CHECK-NEXT:    vslidedown.vx v16, v8, s8
+; CHECK-NEXT:    li s8, 113
+; CHECK-NEXT:    vslidedown.vx v28, v12, s7
+; CHECK-NEXT:    vslidedown.vx v17, v8, s7
+; CHECK-NEXT:    lui t2, 1
+; CHECK-NEXT:    addiw t2, t2, 1928
+; CHECK-NEXT:    add s7, sp, t2
+; CHECK-NEXT:    vslidedown.vx v29, v12, s6
+; CHECK-NEXT:    vslidedown.vx v18, v8, s6
+; CHECK-NEXT:    li s6, 112
+; CHECK-NEXT:    vslidedown.vx v30, v12, s5
+; CHECK-NEXT:    vslidedown.vx v19, v8, s5
+; CHECK-NEXT:    lui t2, 1
+; CHECK-NEXT:    addiw t2, t2, 1920
+; CHECK-NEXT:    add t2, sp, t2
+; CHECK-NEXT:    vslidedown.vx v31, v12, s4
+; CHECK-NEXT:    vslidedown.vx v20, v8, s4
+; CHECK-NEXT:    li s4, 111
+; CHECK-NEXT:    vslidedown.vx v7, v12, s3
+; CHECK-NEXT:    vslidedown.vx v21, v8, s3
+; CHECK-NEXT:    vslidedown.vx v6, v12, t6
+; CHECK-NEXT:    vslidedown.vx v22, v8, t6
+; CHECK-NEXT:    li s11, 110
+; CHECK-NEXT:    vslidedown.vx v5, v12, t5
+; CHECK-NEXT:    vslidedown.vx v23, v8, t5
+; CHECK-NEXT:    lui t5, 1
+; CHECK-NEXT:    addiw t5, t5, 1904
+; CHECK-NEXT:    add s5, sp, t5
+; CHECK-NEXT:    vslidedown.vx v4, v12, t4
+; CHECK-NEXT:    vslidedown.vx v24, v8, t4
+; CHECK-NEXT:    li ra, 109
+; CHECK-NEXT:    vse8.v v10, (a7)
+; CHECK-NEXT:    lui a7, 1
+; CHECK-NEXT:    addiw a7, a7, 1896
+; CHECK-NEXT:    add a7, sp, a7
+; CHECK-NEXT:    vse8.v v25, (t3)
+; CHECK-NEXT:    li t6, 108
+; CHECK-NEXT:    vslidedown.vx v3, v12, a0
+; CHECK-NEXT:    vslidedown.vx v10, v8, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1888
+; CHECK-NEXT:    add t3, sp, a0
+; CHECK-NEXT:    vse8.v v26, (a2)
+; CHECK-NEXT:    li t4, 107
+; CHECK-NEXT:    vse8.v v27, (t1)
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1880
+; CHECK-NEXT:    add a2, sp, a0
+; CHECK-NEXT:    vslidedown.vx v2, v12, s10
+; CHECK-NEXT:    vslidedown.vx v25, v8, s10
+; CHECK-NEXT:    li t1, 106
+; CHECK-NEXT:    vse8.v v28, (a3)
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1872
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vse8.v v29, (t0)
+; CHECK-NEXT:    vslidedown.vx v1, v12, s8
+; CHECK-NEXT:    vslidedown.vx v26, v8, s8
+; CHECK-NEXT:    lui a3, 1
+; CHECK-NEXT:    addiw a3, a3, 2041
+; CHECK-NEXT:    add t0, sp, a3
+; CHECK-NEXT:    lui a3, 1
+; CHECK-NEXT:    addiw a3, a3, 1992
+; CHECK-NEXT:    add a3, sp, a3
+; CHECK-NEXT:    vse8.v v30, (a3)
+; CHECK-NEXT:    vse8.v v31, (a4)
+; CHECK-NEXT:    lui a3, 1
+; CHECK-NEXT:    addiw a3, a3, 2033
+; CHECK-NEXT:    add s9, sp, a3
+; CHECK-NEXT:    vslidedown.vx v31, v12, s6
+; CHECK-NEXT:    vslidedown.vx v27, v8, s6
+; CHECK-NEXT:    li t5, 381
+; CHECK-NEXT:    lui a3, 1
+; CHECK-NEXT:    addiw a3, a3, 1976
+; CHECK-NEXT:    add a3, sp, a3
+; CHECK-NEXT:    vse8.v v7, (a3)
+; CHECK-NEXT:    lui a3, 1
+; CHECK-NEXT:    addiw a3, a3, 2025
+; CHECK-NEXT:    add a3, sp, a3
+; CHECK-NEXT:    vse8.v v6, (a6)
+; CHECK-NEXT:    li s3, 380
+; CHECK-NEXT:    vslidedown.vx v7, v12, s4
+; CHECK-NEXT:    vslidedown.vx v28, v8, s4
+; CHECK-NEXT:    lui a4, 1
+; CHECK-NEXT:    addiw a4, a4, 2017
+; CHECK-NEXT:    add a6, sp, a4
+; CHECK-NEXT:    lui a4, 1
+; CHECK-NEXT:    addiw a4, a4, 1960
+; CHECK-NEXT:    add a4, sp, a4
+; CHECK-NEXT:    vse8.v v5, (a4)
+; CHECK-NEXT:    li a4, 379
+; CHECK-NEXT:    vse8.v v4, (a1)
+; CHECK-NEXT:    lui a1, 1
+; CHECK-NEXT:    addiw a1, a1, 2009
+; CHECK-NEXT:    add a1, sp, a1
+; CHECK-NEXT:    vslidedown.vx v6, v12, s11
+; CHECK-NEXT:    vslidedown.vx v29, v8, s11
+; CHECK-NEXT:    li s10, 378
+; CHECK-NEXT:    vse8.v v3, (a5)
+; CHECK-NEXT:    lui a5, 1
+; CHECK-NEXT:    addiw a5, a5, 2001
+; CHECK-NEXT:    add a5, sp, a5
+; CHECK-NEXT:    vse8.v v2, (s2)
+; CHECK-NEXT:    li s8, 377
+; CHECK-NEXT:    vslidedown.vx v5, v12, ra
+; CHECK-NEXT:    vslidedown.vx v30, v8, ra
+; CHECK-NEXT:    lui s2, 1
+; CHECK-NEXT:    addiw s2, s2, 1993
+; CHECK-NEXT:    add s2, sp, s2
+; CHECK-NEXT:    vse8.v v1, (s7)
+; CHECK-NEXT:    li s7, 376
+; CHECK-NEXT:    vse8.v v31, (t2)
+; CHECK-NEXT:    lui t2, 1
+; CHECK-NEXT:    addiw t2, t2, 1985
+; CHECK-NEXT:    add t2, sp, t2
+; CHECK-NEXT:    vslidedown.vx v4, v12, t6
+; CHECK-NEXT:    vslidedown.vx v31, v8, t6
+; CHECK-NEXT:    li s6, 375
+; CHECK-NEXT:    lui t6, 1
+; CHECK-NEXT:    addiw t6, t6, 1912
+; CHECK-NEXT:    add t6, sp, t6
+; CHECK-NEXT:    vse8.v v7, (t6)
+; CHECK-NEXT:    lui t6, 1
+; CHECK-NEXT:    addiw t6, t6, 1977
+; CHECK-NEXT:    add t6, sp, t6
+; CHECK-NEXT:    vse8.v v6, (s5)
+; CHECK-NEXT:    li s5, 374
+; CHECK-NEXT:    vslidedown.vx v3, v12, t4
+; CHECK-NEXT:    vslidedown.vx v7, v8, t4
+; CHECK-NEXT:    lui t4, 1
+; CHECK-NEXT:    addiw t4, t4, 1969
+; CHECK-NEXT:    add t4, sp, t4
+; CHECK-NEXT:    vse8.v v5, (a7)
+; CHECK-NEXT:    li s4, 373
+; CHECK-NEXT:    vse8.v v4, (t3)
+; CHECK-NEXT:    lui a7, 1
+; CHECK-NEXT:    addiw a7, a7, 1961
+; CHECK-NEXT:    add a7, sp, a7
+; CHECK-NEXT:    vslidedown.vx v5, v12, t1
+; CHECK-NEXT:    vslidedown.vx v6, v8, t1
+; CHECK-NEXT:    li t3, 372
+; CHECK-NEXT:    vse8.v v3, (a2)
+; CHECK-NEXT:    lui a2, 1
+; CHECK-NEXT:    addiw a2, a2, 1953
+; CHECK-NEXT:    add a2, sp, a2
+; CHECK-NEXT:    vse8.v v5, (a0)
+; CHECK-NEXT:    li t1, 371
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    li a0, 383
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (t0)
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1945
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    li t0, 382
+; CHECK-NEXT:    vslidedown.vx v4, v12, t0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (s9)
+; CHECK-NEXT:    li t0, 370
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, t5
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a3)
+; CHECK-NEXT:    lui a3, 1
+; CHECK-NEXT:    addiw a3, a3, 1937
+; CHECK-NEXT:    add a3, sp, a3
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, s3
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a6)
+; CHECK-NEXT:    li t5, 369
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a4
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a1)
+; CHECK-NEXT:    lui a1, 1
+; CHECK-NEXT:    addiw a1, a1, 1929
+; CHECK-NEXT:    add a1, sp, a1
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, s10
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a5)
+; CHECK-NEXT:    li s3, 368
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, s8
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (s2)
+; CHECK-NEXT:    lui a4, 1
+; CHECK-NEXT:    addiw a4, a4, 1921
+; CHECK-NEXT:    add a4, sp, a4
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, s7
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (t2)
+; CHECK-NEXT:    li t2, 367
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, s6
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (t6)
+; CHECK-NEXT:    lui a5, 1
+; CHECK-NEXT:    addiw a5, a5, 1913
+; CHECK-NEXT:    add a5, sp, a5
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, s5
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (t4)
+; CHECK-NEXT:    li t4, 366
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, s4
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a7)
+; CHECK-NEXT:    lui a6, 1
+; CHECK-NEXT:    addiw a6, a6, 1905
+; CHECK-NEXT:    add a6, sp, a6
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, t3
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a2)
+; CHECK-NEXT:    li a7, 365
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, t1
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1897
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, t0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a3)
+; CHECK-NEXT:    li a3, 364
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, t5
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a1)
+; CHECK-NEXT:    lui a1, 1
+; CHECK-NEXT:    addiw a1, a1, 1889
+; CHECK-NEXT:    add a1, sp, a1
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, s3
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a4)
+; CHECK-NEXT:    li a4, 363
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, t2
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a5)
+; CHECK-NEXT:    lui a2, 1
+; CHECK-NEXT:    addiw a2, a2, 1881
+; CHECK-NEXT:    add a2, sp, a2
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, t4
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a6)
+; CHECK-NEXT:    li a5, 362
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a7
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1873
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a3
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a1)
+; CHECK-NEXT:    li a3, 361
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a4
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a2)
+; CHECK-NEXT:    lui a1, 1
+; CHECK-NEXT:    addiw a1, a1, 1865
+; CHECK-NEXT:    add a1, sp, a1
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a5
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a2, 360
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a3
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a1)
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1857
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a2
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 359
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1849
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 358
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1841
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 357
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1833
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 356
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1825
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 355
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1817
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 354
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1809
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 353
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1801
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 352
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1793
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 351
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1785
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 350
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1777
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 349
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1769
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 348
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1761
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 347
+; CHECK-NEXT:    vsetivli zero, 1, e8, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vx v4, v12, a0
+; CHECK-NEXT:    lui a0, 1
+; CHECK-NEXT:    addiw a0, a0, 1753
+; CHECK-NEXT:    add a0, sp, a0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vse8.v v4, (a0)
+; CHECK-NEXT:    li a0, 346
+; CHECK-NEXT:    vsetivli zero, 1, ...
[truncated]

@@ -0,0 +1,19477 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
; RUN: llc < %s -mtriple=riscv64 -mattr=+v,+zvl2048b | FileCheck %s
Collaborator

This test is massive, and the output doesn't add much. Maybe convert this into a crash only test (i.e. don't check the output)?

@wangpc-pp
Contributor

Can v2048i8 be added to make MVT symmetric?

@topperc
Collaborator Author

topperc commented Mar 27, 2025

> Can v2048i8 be added to make MVT symmetric?

To support RISC-V we would need to add any of these that are missing: v2048i8, v1024i16, v512i32, v256i64, v1024f16, v512f32, and v256f64. We expect to be able to freely bitcast between vectors with the same size.
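The list of types above follows from dividing a 2048-byte vector by each element width. A small sketch of that arithmetic (the helper `numElts` is a hypothetical name used only for illustration):

```cpp
// Number of elements in a fixed-length vector that is TotalBytes bytes wide
// and has elements of EltBits bits each.
constexpr unsigned numElts(unsigned TotalBytes, unsigned EltBits) {
  return TotalBytes * 8 / EltBits;
}

// A 2048-byte vector, the size of v2048i8, at each element width:
static_assert(numElts(2048, 8) == 2048, "v2048i8");
static_assert(numElts(2048, 16) == 1024, "v1024i16 and v1024f16");
static_assert(numElts(2048, 32) == 512, "v512i32 and v512f32");
static_assert(numElts(2048, 64) == 256, "v256i64 and v256f64");

// The current 1024-byte limit gives the types named in the source comment:
static_assert(numElts(1024, 8) == 1024, "v1024i8");
static_assert(numElts(1024, 16) == 512, "v512i16");
```

Every type in a size class has the same total width, which is why all of them would need to exist for same-size bitcasts to work.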

Collaborator

@preames preames left a comment

LGTM

@topperc topperc merged commit d131b78 into llvm:main Mar 28, 2025
11 checks passed
@topperc topperc deleted the pr/i1-2048 branch March 28, 2025 02:12