Skip to content

[RISCV] Add codegen support for Zvfbfmin #87911

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 7, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
50 changes: 45 additions & 5 deletions llvm/lib/Target/RISCV/RISCVISelLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1084,6 +1084,23 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
}
}

// TODO: Could we merge some code with zvfhmin?
if (Subtarget.hasVInstructionsBF16()) {
for (MVT VT : BF16VecVTs) {
if (!isTypeLegal(VT))
continue;
setOperationAction({ISD::FP_ROUND, ISD::FP_EXTEND}, VT, Custom);
setOperationAction({ISD::VP_FP_ROUND, ISD::VP_FP_EXTEND}, VT, Custom);
setOperationAction({ISD::STRICT_FP_ROUND, ISD::STRICT_FP_EXTEND}, VT,
Custom);
setOperationAction({ISD::CONCAT_VECTORS, ISD::INSERT_SUBVECTOR,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are concat, insert, extract, and scalar_to_vector tested? It's not immediately obvious from the test names

Copy link
Contributor Author

@jacquesguan jacquesguan Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added test cases for insert_vectors and extract_vectors, and removed scalar_to_vector now since it should need bf16 scalar extension. For concat_vectors, I didn't find the direct case of it.

ISD::EXTRACT_SUBVECTOR},
VT, Custom);
setOperationAction({ISD::LOAD, ISD::STORE}, VT, Custom);
// TODO: Promote to fp32.
Copy link
Contributor

@michaelmaitland michaelmaitland Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like F16Minimal has support for STRICT_FP_ROUND, STRICT_FP_EXTEND, VP_MERGE, VP_SELECT, SELECT, SELECT_CC, SINT_TO_FP, UINT_TO_FP, VP_SINT_TO_FP, VP_UINT_TO_FP, and SPLAT_VECTOR but these are not included for BF16. Do you plan to add support for these?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zvfbfmin does not have any integer conversion instructions so SINT_TO_FP, UINT_TO_FP, VP_SINT_TO_FP, VP_UINT_TO_FP can be removed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious what SPLAT_VECTOR for F16Minimal does since the scalar FP16 type isn't guaranteed legal with F16Minimal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SPLAT_VECTOR and select/merge with splat depends on Zvfh, so I plan to add these in a separate patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious what SPLAT_VECTOR for F16Minimal does since the scalar FP16 type isn't guaranteed legal with F16Minimal.

I create a pr #88275 to add the constraint for SPLAT_VECTOR.

Copy link
Contributor

@michaelmaitland michaelmaitland Apr 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the comments in this thread, we have been able to trim down the original list. We are left with consideration of STRICT_FP_ROUND, STRICT_FP_EXTEND. Are we able to address those opcodes here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the comments in this thread, we have been able to trim down the original list. We are left with consideration of STRICT_FP_ROUND, STRICT_FP_EXTEND. Are we able to address those opcodes here?

Added STRICT_FP_ROUND, STRICT_FP_EXTEND now.

}
}

if (Subtarget.hasVInstructionsF32()) {
for (MVT VT : F32VecVTs) {
if (!isTypeLegal(VT))
Expand Down Expand Up @@ -1299,6 +1316,19 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
continue;
}

if (VT.getVectorElementType() == MVT::bf16) {
setOperationAction({ISD::FP_ROUND, ISD::FP_EXTEND}, VT, Custom);
setOperationAction({ISD::VP_FP_ROUND, ISD::VP_FP_EXTEND}, VT, Custom);
setOperationAction({ISD::STRICT_FP_ROUND, ISD::STRICT_FP_EXTEND}, VT,
Custom);
setOperationAction({ISD::CONCAT_VECTORS, ISD::INSERT_SUBVECTOR,
ISD::EXTRACT_SUBVECTOR},
VT, Custom);
setOperationAction({ISD::LOAD, ISD::STORE}, VT, Custom);
// TODO: Promote to fp32.
continue;
}

// We use EXTRACT_SUBVECTOR as a "cast" from scalable to fixed.
setOperationAction({ISD::INSERT_SUBVECTOR, ISD::EXTRACT_SUBVECTOR}, VT,
Custom);
Expand Down Expand Up @@ -2558,6 +2588,10 @@ static bool useRVVForFixedLengthVectorVT(MVT VT,
if (!Subtarget.hasVInstructionsF16Minimal())
return false;
break;
case MVT::bf16:
if (!Subtarget.hasVInstructionsBF16())
return false;
break;
case MVT::f32:
if (!Subtarget.hasVInstructionsF32())
return false;
Expand Down Expand Up @@ -2609,6 +2643,7 @@ static MVT getContainerForFixedLengthVector(const TargetLowering &TLI, MVT VT,
case MVT::i16:
case MVT::i32:
case MVT::i64:
case MVT::bf16:
case MVT::f16:
case MVT::f32:
case MVT::f64: {
Expand Down Expand Up @@ -8057,8 +8092,10 @@ RISCVTargetLowering::lowerStrictFPExtendOrRoundLike(SDValue Op,

// RVV can only widen/truncate fp to types double/half the size as the source.
if ((VT.getVectorElementType() == MVT::f64 &&
SrcVT.getVectorElementType() == MVT::f16) ||
(VT.getVectorElementType() == MVT::f16 &&
(SrcVT.getVectorElementType() == MVT::f16 ||
SrcVT.getVectorElementType() == MVT::bf16)) ||
((VT.getVectorElementType() == MVT::f16 ||
VT.getVectorElementType() == MVT::bf16) &&
SrcVT.getVectorElementType() == MVT::f64)) {
// For double rounding, the intermediate rounding should be round-to-odd.
unsigned InterConvOpc = Op.getOpcode() == ISD::STRICT_FP_EXTEND
Expand Down Expand Up @@ -8102,9 +8139,12 @@ RISCVTargetLowering::lowerVectorFPExtendOrRoundLike(SDValue Op,
SDValue Src = Op.getOperand(0);
MVT SrcVT = Src.getSimpleValueType();

bool IsDirectExtend = IsExtend && (VT.getVectorElementType() != MVT::f64 ||
SrcVT.getVectorElementType() != MVT::f16);
bool IsDirectTrunc = !IsExtend && (VT.getVectorElementType() != MVT::f16 ||
bool IsDirectExtend =
IsExtend && (VT.getVectorElementType() != MVT::f64 ||
(SrcVT.getVectorElementType() != MVT::f16 &&
SrcVT.getVectorElementType() != MVT::bf16));
bool IsDirectTrunc = !IsExtend && ((VT.getVectorElementType() != MVT::f16 &&
VT.getVectorElementType() != MVT::bf16) ||
SrcVT.getVectorElementType() != MVT::f64);

bool IsDirectConv = IsDirectExtend || IsDirectTrunc;
Expand Down
32 changes: 16 additions & 16 deletions llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
Original file line number Diff line number Diff line change
Expand Up @@ -355,24 +355,24 @@ defset list<VTypeInfo> AllVectors = {
V_M8, f64, FPR64>;
}
}
}

defset list<VTypeInfo> AllBFloatVectors = {
defset list<VTypeInfo> NoGroupBFloatVectors = {
defset list<VTypeInfo> FractionalGroupBFloatVectors = {
def VBF16MF4: VTypeInfo<vbfloat16mf4_t, vbool64_t, 16, V_MF4, bf16, FPR16>;
def VBF16MF2: VTypeInfo<vbfloat16mf2_t, vbool32_t, 16, V_MF2, bf16, FPR16>;
defset list<VTypeInfo> AllBFloatVectors = {
defset list<VTypeInfo> NoGroupBFloatVectors = {
defset list<VTypeInfo> FractionalGroupBFloatVectors = {
def VBF16MF4: VTypeInfo<vbfloat16mf4_t, vbool64_t, 16, V_MF4, bf16, FPR16>;
def VBF16MF2: VTypeInfo<vbfloat16mf2_t, vbool32_t, 16, V_MF2, bf16, FPR16>;
}
def VBF16M1: VTypeInfo<vbfloat16m1_t, vbool16_t, 16, V_M1, bf16, FPR16>;
}

defset list<GroupVTypeInfo> GroupBFloatVectors = {
def VBF16M2: GroupVTypeInfo<vbfloat16m2_t, vbfloat16m1_t, vbool8_t, 16,
V_M2, bf16, FPR16>;
def VBF16M4: GroupVTypeInfo<vbfloat16m4_t, vbfloat16m1_t, vbool4_t, 16,
V_M4, bf16, FPR16>;
def VBF16M8: GroupVTypeInfo<vbfloat16m8_t, vbfloat16m1_t, vbool2_t, 16,
V_M8, bf16, FPR16>;
}
def VBF16M1: VTypeInfo<vbfloat16m1_t, vbool16_t, 16, V_M1, bf16, FPR16>;
}

defset list<GroupVTypeInfo> GroupBFloatVectors = {
def VBF16M2: GroupVTypeInfo<vbfloat16m2_t, vbfloat16m1_t, vbool8_t, 16,
V_M2, bf16, FPR16>;
def VBF16M4: GroupVTypeInfo<vbfloat16m4_t, vbfloat16m1_t, vbool4_t, 16,
V_M4, bf16, FPR16>;
def VBF16M8: GroupVTypeInfo<vbfloat16m8_t, vbfloat16m1_t, vbool2_t, 16,
V_M8, bf16, FPR16>;
}
}

Expand Down
14 changes: 14 additions & 0 deletions llvm/lib/Target/RISCV/RISCVInstrInfoVSDPatterns.td
Original file line number Diff line number Diff line change
Expand Up @@ -1495,6 +1495,20 @@ foreach fvtiToFWti = AllWidenableFloatVectors in {
fvti.AVL, fvti.Log2SEW, TA_MA)>;
}

foreach fvtiToFWti = AllWidenableBFloatToFloatVectors in {
defvar fvti = fvtiToFWti.Vti;
defvar fwti = fvtiToFWti.Wti;
let Predicates = [HasVInstructionsBF16] in
def : Pat<(fvti.Vector (fpround (fwti.Vector fwti.RegClass:$rs1))),
(!cast<Instruction>("PseudoVFNCVTBF16_F_F_W_"#fvti.LMul.MX#"_E"#fvti.SEW)
(fvti.Vector (IMPLICIT_DEF)),
fwti.RegClass:$rs1,
// Value to indicate no rounding mode change in
// RISCVInsertReadWriteCSR
FRM_DYN,
fvti.AVL, fvti.Log2SEW, TA_MA)>;
}

//===----------------------------------------------------------------------===//
// Vector Splats
//===----------------------------------------------------------------------===//
Expand Down
30 changes: 30 additions & 0 deletions llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
Original file line number Diff line number Diff line change
Expand Up @@ -2670,6 +2670,20 @@ foreach fvtiToFWti = AllWidenableFloatVectors in {
GPR:$vl, fvti.Log2SEW, TA_MA)>;
}

foreach fvtiToFWti = AllWidenableBFloatToFloatVectors in {
defvar fvti = fvtiToFWti.Vti;
defvar fwti = fvtiToFWti.Wti;
let Predicates = [HasVInstructionsBF16] in
def : Pat<(fwti.Vector (any_riscv_fpextend_vl
(fvti.Vector fvti.RegClass:$rs1),
(fvti.Mask V0),
VLOpFrag)),
(!cast<Instruction>("PseudoVFWCVTBF16_F_F_V_"#fvti.LMul.MX#"_E"#fvti.SEW#"_MASK")
(fwti.Vector (IMPLICIT_DEF)), fvti.RegClass:$rs1,
(fvti.Mask V0),
GPR:$vl, fvti.Log2SEW, TA_MA)>;
}

// 13.19 Narrowing Floating-Point/Integer Type-Convert Instructions
defm : VPatNConvertFP2IVL_W_RM<riscv_vfcvt_xu_f_vl, "PseudoVFNCVT_XU_F_W">;
defm : VPatNConvertFP2IVL_W_RM<riscv_vfcvt_x_f_vl, "PseudoVFNCVT_X_F_W">;
Expand Down Expand Up @@ -2714,6 +2728,22 @@ foreach fvtiToFWti = AllWidenableFloatVectors in {
}
}

foreach fvtiToFWti = AllWidenableBFloatToFloatVectors in {
defvar fvti = fvtiToFWti.Vti;
defvar fwti = fvtiToFWti.Wti;
let Predicates = [HasVInstructionsBF16] in
def : Pat<(fvti.Vector (any_riscv_fpround_vl
(fwti.Vector fwti.RegClass:$rs1),
(fwti.Mask V0), VLOpFrag)),
(!cast<Instruction>("PseudoVFNCVTBF16_F_F_W_"#fvti.LMul.MX#"_E"#fvti.SEW#"_MASK")
(fvti.Vector (IMPLICIT_DEF)), fwti.RegClass:$rs1,
(fwti.Mask V0),
// Value to indicate no rounding mode change in
// RISCVInsertReadWriteCSR
FRM_DYN,
GPR:$vl, fvti.Log2SEW, TA_MA)>;
}

// 14. Vector Reduction Operations

// 14.1. Vector Single-Width Integer Reduction Instructions
Expand Down
58 changes: 56 additions & 2 deletions llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -mtriple riscv32 -mattr=+m,+d,+zfh,+zvfh,+v -verify-machineinstrs < %s | FileCheck %s
; RUN: llc -mtriple riscv64 -mattr=+m,+d,+zfh,+zvfh,+v -verify-machineinstrs < %s | FileCheck %s
; RUN: llc -mtriple riscv32 -mattr=+m,+d,+zfh,+zvfh,+v,+experimental-zvfbfmin -verify-machineinstrs < %s | FileCheck %s
; RUN: llc -mtriple riscv64 -mattr=+m,+d,+zfh,+zvfh,+v,+experimental-zvfbfmin -verify-machineinstrs < %s | FileCheck %s

define <vscale x 4 x i32> @extract_nxv8i32_nxv4i32_0(<vscale x 8 x i32> %vec) {
; CHECK-LABEL: extract_nxv8i32_nxv4i32_0:
Expand Down Expand Up @@ -481,6 +481,60 @@ define <vscale x 6 x half> @extract_nxv6f16_nxv12f16_6(<vscale x 12 x half> %in)
ret <vscale x 6 x half> %res
}

define <vscale x 2 x bfloat> @extract_nxv2bf16_nxv16bf16_0(<vscale x 16 x bfloat> %vec) {
; CHECK-LABEL: extract_nxv2bf16_nxv16bf16_0:
; CHECK: # %bb.0:
; CHECK-NEXT: ret
%c = call <vscale x 2 x bfloat> @llvm.vector.extract.nxv2bf16.nxv16bf16(<vscale x 16 x bfloat> %vec, i64 0)
ret <vscale x 2 x bfloat> %c
}

define <vscale x 2 x bfloat> @extract_nxv2bf16_nxv16bf16_2(<vscale x 16 x bfloat> %vec) {
; CHECK-LABEL: extract_nxv2bf16_nxv16bf16_2:
; CHECK: # %bb.0:
; CHECK-NEXT: csrr a0, vlenb
; CHECK-NEXT: srli a0, a0, 2
; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
; CHECK-NEXT: vslidedown.vx v8, v8, a0
; CHECK-NEXT: ret
%c = call <vscale x 2 x bfloat> @llvm.vector.extract.nxv2bf16.nxv16bf16(<vscale x 16 x bfloat> %vec, i64 2)
ret <vscale x 2 x bfloat> %c
}

define <vscale x 2 x bfloat> @extract_nxv2bf16_nxv16bf16_4(<vscale x 16 x bfloat> %vec) {
; CHECK-LABEL: extract_nxv2bf16_nxv16bf16_4:
; CHECK: # %bb.0:
; CHECK-NEXT: vmv1r.v v8, v9
; CHECK-NEXT: ret
%c = call <vscale x 2 x bfloat> @llvm.vector.extract.nxv2bf16.nxv16bf16(<vscale x 16 x bfloat> %vec, i64 4)
ret <vscale x 2 x bfloat> %c
}

define <vscale x 6 x bfloat> @extract_nxv6bf16_nxv12bf16_0(<vscale x 12 x bfloat> %in) {
; CHECK-LABEL: extract_nxv6bf16_nxv12bf16_0:
; CHECK: # %bb.0:
; CHECK-NEXT: ret
%res = call <vscale x 6 x bfloat> @llvm.vector.extract.nxv6bf16.nxv12bf16(<vscale x 12 x bfloat> %in, i64 0)
ret <vscale x 6 x bfloat> %res
}

define <vscale x 6 x bfloat> @extract_nxv6bf16_nxv12bf16_6(<vscale x 12 x bfloat> %in) {
; CHECK-LABEL: extract_nxv6bf16_nxv12bf16_6:
; CHECK: # %bb.0:
; CHECK-NEXT: csrr a0, vlenb
; CHECK-NEXT: srli a0, a0, 2
; CHECK-NEXT: vsetvli a1, zero, e16, m1, ta, ma
; CHECK-NEXT: vslidedown.vx v13, v10, a0
; CHECK-NEXT: vslidedown.vx v12, v9, a0
; CHECK-NEXT: add a1, a0, a0
; CHECK-NEXT: vsetvli zero, a1, e16, m1, ta, ma
; CHECK-NEXT: vslideup.vx v12, v10, a0
; CHECK-NEXT: vmv2r.v v8, v12
; CHECK-NEXT: ret
%res = call <vscale x 6 x bfloat> @llvm.vector.extract.nxv6bf16.nxv12bf16(<vscale x 12 x bfloat> %in, i64 6)
ret <vscale x 6 x bfloat> %res
}

declare <vscale x 6 x half> @llvm.vector.extract.nxv6f16.nxv12f16(<vscale x 12 x half>, i64)

declare <vscale x 1 x i8> @llvm.vector.extract.nxv1i8.nxv4i8(<vscale x 4 x i8> %vec, i64 %idx)
Expand Down
58 changes: 54 additions & 4 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fpext-vp.ll
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zvfh,+v -verify-machineinstrs < %s | FileCheck %s
; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zvfh,+v -verify-machineinstrs < %s | FileCheck %s
; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zvfhmin,+v -verify-machineinstrs < %s | FileCheck %s
; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zvfhmin,+v -verify-machineinstrs < %s | FileCheck %s
; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zvfh,+v,+experimental-zvfbfmin -verify-machineinstrs < %s | FileCheck %s
; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zvfh,+v,+experimental-zvfbfmin -verify-machineinstrs < %s | FileCheck %s
; RUN: llc -mtriple=riscv32 -mattr=+d,+zfh,+zvfhmin,+v,+experimental-zvfbfmin -verify-machineinstrs < %s | FileCheck %s
; RUN: llc -mtriple=riscv64 -mattr=+d,+zfh,+zvfhmin,+v,+experimental-zvfbfmin -verify-machineinstrs < %s | FileCheck %s

declare <2 x float> @llvm.vp.fpext.v2f32.v2f16(<2 x half>, <2 x i1>, i32)

Expand Down Expand Up @@ -120,3 +120,53 @@ define <32 x double> @vfpext_v32f32_v32f64(<32 x float> %a, <32 x i1> %m, i32 ze
%v = call <32 x double> @llvm.vp.fpext.v32f64.v32f32(<32 x float> %a, <32 x i1> %m, i32 %vl)
ret <32 x double> %v
}

declare <2 x float> @llvm.vp.fpext.v2f32.v2bf16(<2 x bfloat>, <2 x i1>, i32)

define <2 x float> @vfpext_v2bf16_v2f32(<2 x bfloat> %a, <2 x i1> %m, i32 zeroext %vl) {
; CHECK-LABEL: vfpext_v2bf16_v2f32:
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e16, mf4, ta, ma
; CHECK-NEXT: vfwcvtbf16.f.f.v v9, v8, v0.t
; CHECK-NEXT: vmv1r.v v8, v9
; CHECK-NEXT: ret
%v = call <2 x float> @llvm.vp.fpext.v2f32.v2bf16(<2 x bfloat> %a, <2 x i1> %m, i32 %vl)
ret <2 x float> %v
}

define <2 x float> @vfpext_v2bf16_v2f32_unmasked(<2 x bfloat> %a, i32 zeroext %vl) {
; CHECK-LABEL: vfpext_v2bf16_v2f32_unmasked:
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e16, mf4, ta, ma
; CHECK-NEXT: vfwcvtbf16.f.f.v v9, v8
; CHECK-NEXT: vmv1r.v v8, v9
; CHECK-NEXT: ret
%v = call <2 x float> @llvm.vp.fpext.v2f32.v2bf16(<2 x bfloat> %a, <2 x i1> shufflevector (<2 x i1> insertelement (<2 x i1> undef, i1 true, i32 0), <2 x i1> undef, <2 x i32> zeroinitializer), i32 %vl)
ret <2 x float> %v
}

declare <2 x double> @llvm.vp.fpext.v2f64.v2bf16(<2 x bfloat>, <2 x i1>, i32)

define <2 x double> @vfpext_v2bf16_v2f64(<2 x bfloat> %a, <2 x i1> %m, i32 zeroext %vl) {
; CHECK-LABEL: vfpext_v2bf16_v2f64:
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e16, mf4, ta, ma
; CHECK-NEXT: vfwcvtbf16.f.f.v v9, v8, v0.t
; CHECK-NEXT: vsetvli zero, zero, e32, mf2, ta, ma
; CHECK-NEXT: vfwcvt.f.f.v v8, v9, v0.t
; CHECK-NEXT: ret
%v = call <2 x double> @llvm.vp.fpext.v2f64.v2bf16(<2 x bfloat> %a, <2 x i1> %m, i32 %vl)
ret <2 x double> %v
}

define <2 x double> @vfpext_v2bf16_v2f64_unmasked(<2 x bfloat> %a, i32 zeroext %vl) {
; CHECK-LABEL: vfpext_v2bf16_v2f64_unmasked:
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e16, mf4, ta, ma
; CHECK-NEXT: vfwcvtbf16.f.f.v v9, v8
; CHECK-NEXT: vsetvli zero, zero, e32, mf2, ta, ma
; CHECK-NEXT: vfwcvt.f.f.v v8, v9
; CHECK-NEXT: ret
%v = call <2 x double> @llvm.vp.fpext.v2f64.v2bf16(<2 x bfloat> %a, <2 x i1> shufflevector (<2 x i1> insertelement (<2 x i1> undef, i1 true, i32 0), <2 x i1> undef, <2 x i32> zeroinitializer), i32 %vl)
ret <2 x double> %v
}
Loading