PowerPC/VSX: Select FMINNUM_IEEE and FMAXNUM_IEEE #112195

Closed
6 changes: 6 additions & 0 deletions llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -775,6 +775,10 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
setOperationAction(ISD::FMAXNUM_IEEE, MVT::f32, Legal);
setOperationAction(ISD::FMINNUM_IEEE, MVT::f64, Legal);
setOperationAction(ISD::FMINNUM_IEEE, MVT::f32, Legal);
setOperationAction(ISD::FMAXNUM, MVT::f64, Legal);
setOperationAction(ISD::FMAXNUM, MVT::f32, Legal);
setOperationAction(ISD::FMINNUM, MVT::f64, Legal);
setOperationAction(ISD::FMINNUM, MVT::f32, Legal);
Comment on lines +778 to +781
Contributor

This goes beyond what the title suggests. Do all subtargets have the ignore-signaling behavior, with only VSX having the IEEE behavior?

Contributor Author

> This is beyond what the title suggests.

It's a bootstrap problem: if we fix only FMAXNUM_IEEE, some of the test cases will fail.
The problem is that FMINNUM is currently expanded to
FCANONICALIZE + FCANONICALIZE + FMINNUM_IEEE
while FCANONICALIZE is not yet defined here; I will submit it in a future patch.

> Do all subtargets have the ignore signaling behavior, and only vsx has the ieee behavior?

It seems that PowerPC doesn't have min/max operations in its scalar part.
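
The FCANONICALIZE + FCANONICALIZE + FMINNUM_IEEE expansion mentioned in this reply can be modelled at the value level. What follows is a minimal sketch, not LLVM code: `Val`, `canonicalize`, `fminnumIEEE`, and `fminnumExpanded` are hypothetical names, and the signaling-NaN rule assumes IEEE 754-2008 minNum semantics (any sNaN input yields a quiet NaN). Values are classified symbolically so the example does not depend on how the host handles signaling NaNs.

```cpp
#include <algorithm>
#include <cassert>

// Toy value model (an assumption for illustration): a float is either an
// ordinary number, a quiet NaN, or a signaling NaN.
enum class Kind { Number, QNaN, SNaN };

struct Val {
  Kind kind;
  double num; // meaningful only when kind == Kind::Number
};

inline Val number(double v) {
  assert(v == v && "use qnan()/snan() for NaN values");
  return {Kind::Number, v};
}
inline Val qnan() { return {Kind::QNaN, 0.0}; }
inline Val snan() { return {Kind::SNaN, 0.0}; }

// Modelled FCANONICALIZE: quiets a signaling NaN, leaves everything else.
inline Val canonicalize(Val x) {
  return x.kind == Kind::SNaN ? qnan() : x;
}

// Modelled FMINNUM_IEEE (assumed IEEE 754-2008 minNum): any sNaN input
// gives a quiet NaN; a single quiet NaN is ignored in favour of the number.
inline Val fminnumIEEE(Val a, Val b) {
  if (a.kind == Kind::SNaN || b.kind == Kind::SNaN) return qnan();
  if (a.kind == Kind::QNaN) return b;
  if (b.kind == Kind::QNaN) return a;
  return number(std::min(a.num, b.num));
}

// FMINNUM as expanded in the comment: quiet both operands, then use the
// IEEE operation, so a lone NaN of either kind is ignored.
inline Val fminnumExpanded(Val a, Val b) {
  return fminnumIEEE(canonicalize(a), canonicalize(b));
}
```

In this model `fminnumExpanded(snan(), number(1.0))` yields the number, while `fminnumIEEE(snan(), number(1.0))` yields a quiet NaN. That gap is why the expansion needs FCANONICALIZE to be selectable on the target.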

Contributor Author

If we fix expandFMINNUMFMAXNUM first, some PowerPC test cases will fail, complaining that we cannot legalize FMINNUM_IEEE.

Contributor Author

@wzssyqa wzssyqa Oct 22, 2024

So my plan to break this bootstrap problem is:

  1. Mark ISD::FMINNUM as Legal for PowerPC/VSX, and
    add FMINNUM_IEEE to PowerPC/VSX.
  2. Add FCANONICALIZE support to PowerPC/VSX.
  3. Fix expandFMINNUMFMAXNUM to drop FCANONICALIZE if FMINNUM_IEEE is available.
  4. Add test cases for fminimumnum to PowerPC/VSX.
  5. Drop setOperationAction(ISD::FMINNUM, MVT::f32, Legal); from PowerPC/VSX.
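
Why the order of these steps matters can be illustrated with a toy action table. This is only a sketch under stated assumptions: `ActionTable`, `isLegal`, and `lowerFMinNum` are hypothetical stand-ins, not LLVM's legalizer, and the fallback rule simply mirrors the FCANONICALIZE + FCANONICALIZE + FMINNUM_IEEE expansion described earlier in the thread.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified view of a target's operation actions.
enum class Op { FMINNUM, FMINNUM_IEEE, FCANONICALIZE };
enum class Action { Legal, Expand };

using ActionTable = std::map<Op, Action>;

// An operation missing from the table is treated as not legal.
inline bool isLegal(const ActionTable &t, Op op) {
  auto it = t.find(op);
  return it != t.end() && it->second == Action::Legal;
}

// Describes, as text, how FMINNUM would be handled in this toy model.
inline std::string lowerFMinNum(const ActionTable &t) {
  if (isLegal(t, Op::FMINNUM))
    return "FMINNUM";                       // selected directly
  if (!isLegal(t, Op::FMINNUM_IEEE))
    return "error: cannot legalize";        // nothing to expand to
  if (!isLegal(t, Op::FCANONICALIZE))
    return "error: cannot select FCANONICALIZE";
  return "FMINNUM_IEEE(FCANONICALIZE, FCANONICALIZE)";
}
```

With the table from step 1 (both FMINNUM and FMINNUM_IEEE legal) the model selects FMINNUM directly. Dropping the FMINNUM entry before FCANONICALIZE is supported hits the second error, which is the ordering hazard the plan tries to avoid.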

Contributor

My concern with this breakdown is that it will leave things in an invalid state. Shouldn't steps 1-4 be done together for the compiler behaviour to be valid? At a minimum, shouldn't the test cases from step 4 be added with the corresponding code change?

Contributor Author

Let's ask @arsenm. I was asked to split these changes when I was working on other architectures.

Contributor Author

@arsenm ping

}

if (Subtarget.hasAltivec()) {
@@ -809,6 +813,8 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
if (Subtarget.hasVSX()) {
setOperationAction(ISD::FMAXNUM, VT, Legal);
setOperationAction(ISD::FMINNUM, VT, Legal);
setOperationAction(ISD::FMAXNUM_IEEE, VT, Legal);
setOperationAction(ISD::FMINNUM_IEEE, VT, Legal);
}

// Vector instructions introduced in P8
33 changes: 13 additions & 20 deletions llvm/lib/Target/PowerPC/PPCInstrVSX.td
@@ -2722,6 +2722,15 @@ def : Pat<(v2f64 (any_fmaxnum v2f64:$src1, v2f64:$src2)),
def : Pat<(v2f64 (any_fminnum v2f64:$src1, v2f64:$src2)),
(v2f64 (XVMINDP $src1, $src2))>;

def : Pat<(v4f32 (fmaxnum_ieee v4f32:$src1, v4f32:$src2)),
(v4f32 (XVMAXSP $src1, $src2))>;
def : Pat<(v4f32 (fminnum_ieee v4f32:$src1, v4f32:$src2)),
(v4f32 (XVMINSP $src1, $src2))>;
def : Pat<(v2f64 (fmaxnum_ieee v2f64:$src1, v2f64:$src2)),
(v2f64 (XVMAXDP $src1, $src2))>;
def : Pat<(v2f64 (fminnum_ieee v2f64:$src1, v2f64:$src2)),
(v2f64 (XVMINDP $src1, $src2))>;

// f32 abs
def : Pat<(f32 (fabs f32:$S)),
(f32 (COPY_TO_REGCLASS (XSABSDP
@@ -2735,39 +2744,23 @@ def : Pat<(f32 (fneg (fabs f32:$S))),
// f32 Min.
def : Pat<(f32 (fminnum_ieee f32:$A, f32:$B)),
(f32 FpMinMax.F32Min)>;
def : Pat<(f32 (fminnum_ieee (fcanonicalize f32:$A), f32:$B)),
(f32 FpMinMax.F32Min)>;
def : Pat<(f32 (fminnum_ieee f32:$A, (fcanonicalize f32:$B))),
(f32 FpMinMax.F32Min)>;
Comment on lines -2738 to -2741
Contributor

I don't understand these patterns folding out a canonicalize on each individual operand. If FpMinMax.F32Min has the IEEE sNaN behavior, both operands need to be guaranteed quiet. Can these be removed as a precommit?

Contributor Author

Just removing them will make some current test cases fail.
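
The reviewer's concern about folding a canonicalize out of a single operand can be checked on operand classes alone. The sketch below is an illustration under assumptions, not the PR's code: it supposes the selected instruction behaves like IEEE 754-2008 minNum for signaling NaNs (as the reviewer's "if" posits), and `minIEEE`, `beforeFold`, `afterFold`, and `foldIsSound` are hypothetical names.

```cpp
// Operand classes only; the numeric payload does not matter here.
enum class Kind { Num, QNaN, SNaN };

// Result class of an assumed IEEE 754-2008 style minNum.
inline Kind minIEEE(Kind a, Kind b) {
  if (a == Kind::SNaN || b == Kind::SNaN) return Kind::QNaN;
  if (a == Kind::QNaN && b == Kind::QNaN) return Kind::QNaN;
  return Kind::Num; // at least one operand is a number
}

// Modelled FCANONICALIZE: quiets a signaling NaN.
inline Kind canonicalize(Kind k) {
  return k == Kind::SNaN ? Kind::QNaN : k;
}

// DAG before selection: fminnum_ieee (fcanonicalize A), B
inline Kind beforeFold(Kind a, Kind b) {
  return minIEEE(canonicalize(a), b);
}

// After a pattern drops the canonicalize, the instruction sees raw A and B.
inline Kind afterFold(Kind a, Kind b) {
  return minIEEE(a, b);
}

// The fold preserves the result for this operand pair only if both agree.
inline bool foldIsSound(Kind a, Kind b) {
  return beforeFold(a, b) == afterFold(a, b);
}
```

In this model `foldIsSound(Kind::SNaN, Kind::Num)` is false: with the canonicalize in place the number is returned, and without it the result is a quiet NaN. The fold only holds when the operand is already known to be quiet.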

def : Pat<(f32 (fminnum_ieee (fcanonicalize f32:$A), (fcanonicalize f32:$B))),
def : Pat<(f32 (fminnum f32:$A, f32:$B)),
(f32 FpMinMax.F32Min)>;
// F32 Max.
def : Pat<(f32 (fmaxnum_ieee f32:$A, f32:$B)),
(f32 FpMinMax.F32Max)>;
def : Pat<(f32 (fmaxnum_ieee (fcanonicalize f32:$A), f32:$B)),
(f32 FpMinMax.F32Max)>;
def : Pat<(f32 (fmaxnum_ieee f32:$A, (fcanonicalize f32:$B))),
(f32 FpMinMax.F32Max)>;
def : Pat<(f32 (fmaxnum_ieee (fcanonicalize f32:$A), (fcanonicalize f32:$B))),
def : Pat<(f32 (fmaxnum f32:$A, f32:$B)),
(f32 FpMinMax.F32Max)>;

// f64 Min.
def : Pat<(f64 (fminnum_ieee f64:$A, f64:$B)),
(f64 (XSMINDP $A, $B))>;
def : Pat<(f64 (fminnum_ieee (fcanonicalize f64:$A), f64:$B)),
(f64 (XSMINDP $A, $B))>;
def : Pat<(f64 (fminnum_ieee f64:$A, (fcanonicalize f64:$B))),
(f64 (XSMINDP $A, $B))>;
def : Pat<(f64 (fminnum_ieee (fcanonicalize f64:$A), (fcanonicalize f64:$B))),
def : Pat<(f64 (fminnum f64:$A, f64:$B)),
(f64 (XSMINDP $A, $B))>;
// f64 Max.
def : Pat<(f64 (fmaxnum_ieee f64:$A, f64:$B)),
(f64 (XSMAXDP $A, $B))>;
def : Pat<(f64 (fmaxnum_ieee (fcanonicalize f64:$A), f64:$B)),
(f64 (XSMAXDP $A, $B))>;
def : Pat<(f64 (fmaxnum_ieee f64:$A, (fcanonicalize f64:$B))),
(f64 (XSMAXDP $A, $B))>;
def : Pat<(f64 (fmaxnum_ieee (fcanonicalize f64:$A), (fcanonicalize f64:$B))),
def : Pat<(f64 (fmaxnum f64:$A, f64:$B)),
(f64 (XSMAXDP $A, $B))>;

def : Pat<(int_ppc_vsx_stxvd2x_be v2f64:$rS, ForceXForm:$dst),
133 changes: 34 additions & 99 deletions llvm/test/CodeGen/PowerPC/fminimum-fmaximum.ll
@@ -301,44 +301,26 @@ define <4 x float> @v4f32_minimum(<4 x float> %a, <4 x float> %b) {
; VSX-NEXT: xvcmpeqsp 1, 35, 35
; VSX-NEXT: xvcmpeqsp 2, 34, 34
; VSX-NEXT: addis 3, 2, .LCPI4_0@toc@ha
; VSX-NEXT: xxleqv 36, 36, 36
; VSX-NEXT: xvminsp 0, 34, 35
; VSX-NEXT: vslw 4, 4, 4
; VSX-NEXT: addi 3, 3, .LCPI4_0@toc@l
; VSX-NEXT: xxlnor 1, 1, 1
; VSX-NEXT: xxlnor 2, 2, 2
; VSX-NEXT: vcmpequw 5, 2, 4
; VSX-NEXT: xvminsp 0, 34, 35
; VSX-NEXT: xxlor 1, 2, 1
; VSX-NEXT: lxvd2x 2, 0, 3
; VSX-NEXT: xxsel 0, 0, 2, 1
; VSX-NEXT: xxlxor 2, 2, 2
; VSX-NEXT: xvcmpeqsp 2, 0, 2
; VSX-NEXT: xxsel 1, 0, 34, 37
; VSX-NEXT: vcmpequw 2, 3, 4
; VSX-NEXT: xxsel 1, 1, 35, 34
; VSX-NEXT: xxsel 34, 0, 1, 2
; VSX-NEXT: xxsel 34, 0, 2, 1
; VSX-NEXT: blr
;
; AIX-LABEL: v4f32_minimum:
; AIX: # %bb.0: # %entry
; AIX-NEXT: xvcmpeqsp 1, 35, 35
; AIX-NEXT: xvcmpeqsp 2, 34, 34
; AIX-NEXT: ld 3, L..C4(2) # %const.0
; AIX-NEXT: xxleqv 36, 36, 36
; AIX-NEXT: xvminsp 0, 34, 35
; AIX-NEXT: vslw 4, 4, 4
; AIX-NEXT: xxlnor 1, 1, 1
; AIX-NEXT: xxlnor 2, 2, 2
; AIX-NEXT: vcmpequw 5, 2, 4
; AIX-NEXT: xxlor 1, 2, 1
; AIX-NEXT: lxvw4x 2, 0, 3
; AIX-NEXT: xxsel 0, 0, 2, 1
; AIX-NEXT: xxlxor 2, 2, 2
; AIX-NEXT: xvcmpeqsp 2, 0, 2
; AIX-NEXT: xxsel 1, 0, 34, 37
; AIX-NEXT: vcmpequw 2, 3, 4
; AIX-NEXT: xxsel 1, 1, 35, 34
; AIX-NEXT: xxsel 34, 0, 1, 2
; AIX-NEXT: xxsel 34, 0, 2, 1
; AIX-NEXT: blr
entry:
%m = call <4 x float> @llvm.minimum.v4f32(<4 x float> %a, <4 x float> %b)
@@ -377,16 +359,9 @@ define <4 x float> @v4f32_maximum(<4 x float> %a, <4 x float> %b) {
; VSX-NEXT: xxlnor 1, 1, 1
; VSX-NEXT: xxlnor 2, 2, 2
; VSX-NEXT: xvmaxsp 0, 34, 35
; VSX-NEXT: xxlxor 36, 36, 36
; VSX-NEXT: vcmpequw 5, 2, 4
; VSX-NEXT: xxlor 1, 2, 1
; VSX-NEXT: lxvd2x 2, 0, 3
; VSX-NEXT: xxsel 0, 0, 2, 1
; VSX-NEXT: xvcmpeqsp 2, 0, 36
; VSX-NEXT: xxsel 1, 0, 34, 37
; VSX-NEXT: vcmpequw 2, 3, 4
; VSX-NEXT: xxsel 1, 1, 35, 34
; VSX-NEXT: xxsel 34, 0, 1, 2
; VSX-NEXT: xxsel 34, 0, 2, 1
; VSX-NEXT: blr
;
; AIX-LABEL: v4f32_maximum:
@@ -395,18 +370,11 @@ define <4 x float> @v4f32_maximum(<4 x float> %a, <4 x float> %b) {
; AIX-NEXT: xvcmpeqsp 2, 34, 34
; AIX-NEXT: ld 3, L..C5(2) # %const.0
; AIX-NEXT: xvmaxsp 0, 34, 35
; AIX-NEXT: xxlxor 36, 36, 36
; AIX-NEXT: xxlnor 1, 1, 1
; AIX-NEXT: xxlnor 2, 2, 2
; AIX-NEXT: vcmpequw 5, 2, 4
; AIX-NEXT: xxlor 1, 2, 1
; AIX-NEXT: lxvw4x 2, 0, 3
; AIX-NEXT: xxsel 0, 0, 2, 1
; AIX-NEXT: xvcmpeqsp 2, 0, 36
; AIX-NEXT: xxsel 1, 0, 34, 37
; AIX-NEXT: vcmpequw 2, 3, 4
; AIX-NEXT: xxsel 1, 1, 35, 34
; AIX-NEXT: xxsel 34, 0, 1, 2
; AIX-NEXT: xxsel 34, 0, 2, 1
; AIX-NEXT: blr
entry:
%m = call <4 x float> @llvm.maximum.v4f32(<4 x float> %a, <4 x float> %b)
@@ -493,47 +461,28 @@ define <2 x double> @v2f64_minimum(<2 x double> %a, <2 x double> %b) {
; VSX-LABEL: v2f64_minimum:
; VSX: # %bb.0: # %entry
; VSX-NEXT: addis 3, 2, .LCPI6_0@toc@ha
; VSX-NEXT: xvcmpeqdp 36, 35, 35
; VSX-NEXT: xvcmpeqdp 37, 34, 34
; VSX-NEXT: addi 3, 3, .LCPI6_0@toc@l
; VSX-NEXT: xxlnor 36, 36, 36
; VSX-NEXT: xxlnor 37, 37, 37
; VSX-NEXT: xvmindp 0, 34, 35
; VSX-NEXT: xvcmpeqdp 35, 35, 35
; VSX-NEXT: addi 3, 3, .LCPI6_0@toc@l
; VSX-NEXT: xvcmpeqdp 34, 34, 34
; VSX-NEXT: xxlnor 35, 35, 35
; VSX-NEXT: xxlnor 34, 34, 34
; VSX-NEXT: lxvd2x 2, 0, 3
; VSX-NEXT: addis 3, 2, .LCPI6_1@toc@ha
; VSX-NEXT: xxlor 1, 37, 36
; VSX-NEXT: addi 3, 3, .LCPI6_1@toc@l
; VSX-NEXT: lxvd2x 36, 0, 3
; VSX-NEXT: vcmpequd 5, 2, 4
; VSX-NEXT: xxsel 0, 0, 2, 1
; VSX-NEXT: xxlxor 2, 2, 2
; VSX-NEXT: xxsel 1, 0, 34, 37
; VSX-NEXT: vcmpequd 2, 3, 4
; VSX-NEXT: xxsel 1, 1, 35, 34
; VSX-NEXT: xvcmpeqdp 34, 0, 2
; VSX-NEXT: xxsel 34, 0, 1, 34
; VSX-NEXT: xxlor 1, 34, 35
; VSX-NEXT: xxsel 34, 0, 2, 1
; VSX-NEXT: blr
;
; AIX-LABEL: v2f64_minimum:
; AIX: # %bb.0: # %entry
; AIX-NEXT: ld 3, L..C6(2) # %const.0
; AIX-NEXT: xvcmpeqdp 36, 35, 35
; AIX-NEXT: xvcmpeqdp 37, 34, 34
; AIX-NEXT: lxvd2x 2, 0, 3
; AIX-NEXT: ld 3, L..C7(2) # %const.1
; AIX-NEXT: xxlnor 36, 36, 36
; AIX-NEXT: xxlnor 37, 37, 37
; AIX-NEXT: xvmindp 0, 34, 35
; AIX-NEXT: xxlor 1, 37, 36
; AIX-NEXT: lxvd2x 36, 0, 3
; AIX-NEXT: vcmpequd 5, 2, 4
; AIX-NEXT: xxsel 0, 0, 2, 1
; AIX-NEXT: xxlxor 2, 2, 2
; AIX-NEXT: xxsel 1, 0, 34, 37
; AIX-NEXT: vcmpequd 2, 3, 4
; AIX-NEXT: xxsel 1, 1, 35, 34
; AIX-NEXT: xvcmpeqdp 34, 0, 2
; AIX-NEXT: xxsel 34, 0, 1, 34
; AIX-NEXT: xvcmpeqdp 35, 35, 35
; AIX-NEXT: lxvd2x 2, 0, 3
; AIX-NEXT: xvcmpeqdp 34, 34, 34
; AIX-NEXT: xxlnor 35, 35, 35
; AIX-NEXT: xxlnor 34, 34, 34
; AIX-NEXT: xxlor 1, 34, 35
; AIX-NEXT: xxsel 34, 0, 2, 1
; AIX-NEXT: blr
entry:
%m = call <2 x double> @llvm.minimum.v2f64(<2 x double> %a, <2 x double> %b)
@@ -618,42 +567,28 @@ define <2 x double> @v2f64_maximum(<2 x double> %a, <2 x double> %b) {
; VSX-LABEL: v2f64_maximum:
; VSX: # %bb.0: # %entry
; VSX-NEXT: addis 3, 2, .LCPI7_0@toc@ha
; VSX-NEXT: xvcmpeqdp 36, 35, 35
; VSX-NEXT: xvcmpeqdp 37, 34, 34
; VSX-NEXT: addi 3, 3, .LCPI7_0@toc@l
; VSX-NEXT: xxlnor 36, 36, 36
; VSX-NEXT: xxlnor 37, 37, 37
; VSX-NEXT: xvmaxdp 0, 34, 35
; VSX-NEXT: xvcmpeqdp 35, 35, 35
; VSX-NEXT: addi 3, 3, .LCPI7_0@toc@l
; VSX-NEXT: xvcmpeqdp 34, 34, 34
; VSX-NEXT: xxlnor 35, 35, 35
; VSX-NEXT: xxlnor 34, 34, 34
; VSX-NEXT: lxvd2x 2, 0, 3
; VSX-NEXT: xxlor 1, 37, 36
; VSX-NEXT: xxlxor 36, 36, 36
; VSX-NEXT: vcmpequd 5, 2, 4
; VSX-NEXT: xxsel 0, 0, 2, 1
; VSX-NEXT: xxsel 1, 0, 34, 37
; VSX-NEXT: vcmpequd 2, 3, 4
; VSX-NEXT: xxsel 1, 1, 35, 34
; VSX-NEXT: xvcmpeqdp 34, 0, 36
; VSX-NEXT: xxsel 34, 0, 1, 34
; VSX-NEXT: xxlor 1, 34, 35
; VSX-NEXT: xxsel 34, 0, 2, 1
; VSX-NEXT: blr
;
; AIX-LABEL: v2f64_maximum:
; AIX: # %bb.0: # %entry
; AIX-NEXT: ld 3, L..C8(2) # %const.0
; AIX-NEXT: xvcmpeqdp 36, 35, 35
; AIX-NEXT: xvcmpeqdp 37, 34, 34
; AIX-NEXT: lxvd2x 2, 0, 3
; AIX-NEXT: xxlnor 36, 36, 36
; AIX-NEXT: xxlnor 37, 37, 37
; AIX-NEXT: ld 3, L..C7(2) # %const.0
; AIX-NEXT: xvmaxdp 0, 34, 35
; AIX-NEXT: xxlor 1, 37, 36
; AIX-NEXT: xxlxor 36, 36, 36
; AIX-NEXT: vcmpequd 5, 2, 4
; AIX-NEXT: xxsel 0, 0, 2, 1
; AIX-NEXT: xxsel 1, 0, 34, 37
; AIX-NEXT: vcmpequd 2, 3, 4
; AIX-NEXT: xxsel 1, 1, 35, 34
; AIX-NEXT: xvcmpeqdp 34, 0, 36
; AIX-NEXT: xxsel 34, 0, 1, 34
; AIX-NEXT: xvcmpeqdp 35, 35, 35
; AIX-NEXT: lxvd2x 2, 0, 3
; AIX-NEXT: xvcmpeqdp 34, 34, 34
; AIX-NEXT: xxlnor 35, 35, 35
; AIX-NEXT: xxlnor 34, 34, 34
; AIX-NEXT: xxlor 1, 34, 35
; AIX-NEXT: xxsel 34, 0, 2, 1
; AIX-NEXT: blr
entry:
%m = call <2 x double> @llvm.maximum.v2f64(<2 x double> %a, <2 x double> %b)
20 changes: 8 additions & 12 deletions llvm/test/CodeGen/PowerPC/scalar-min-max.ll
@@ -117,13 +117,12 @@ define dso_local float @testfmax_fast(float %a, float %b) local_unnamed_addr {
;
; NO-FAST-P9-LABEL: testfmax_fast:
; NO-FAST-P9: # %bb.0: # %entry
; NO-FAST-P9-NEXT: xsmaxcdp f1, f1, f2
; NO-FAST-P9-NEXT: xsmaxdp f1, f1, f2
; NO-FAST-P9-NEXT: blr
;
; NO-FAST-P8-LABEL: testfmax_fast:
; NO-FAST-P8: # %bb.0: # %entry
; NO-FAST-P8-NEXT: xssubsp f0, f2, f1
; NO-FAST-P8-NEXT: fsel f1, f0, f2, f1
; NO-FAST-P8-NEXT: xsmaxdp f1, f1, f2
; NO-FAST-P8-NEXT: blr
entry:
%cmp = fcmp nnan ninf ogt float %a, %b
@@ -138,13 +137,12 @@ define dso_local double @testdmax_fast(double %a, double %b) local_unnamed_addr
;
; NO-FAST-P9-LABEL: testdmax_fast:
; NO-FAST-P9: # %bb.0: # %entry
; NO-FAST-P9-NEXT: xsmaxcdp f1, f1, f2
; NO-FAST-P9-NEXT: xsmaxdp f1, f1, f2
; NO-FAST-P9-NEXT: blr
;
; NO-FAST-P8-LABEL: testdmax_fast:
; NO-FAST-P8: # %bb.0: # %entry
; NO-FAST-P8-NEXT: xssubdp f0, f2, f1
; NO-FAST-P8-NEXT: fsel f1, f0, f2, f1
; NO-FAST-P8-NEXT: xsmaxdp f1, f1, f2
; NO-FAST-P8-NEXT: blr
entry:
%cmp = fcmp nnan ninf ogt double %a, %b
@@ -159,13 +157,12 @@ define dso_local float @testfmin_fast(float %a, float %b) local_unnamed_addr {
;
; NO-FAST-P9-LABEL: testfmin_fast:
; NO-FAST-P9: # %bb.0: # %entry
; NO-FAST-P9-NEXT: xsmincdp f1, f1, f2
; NO-FAST-P9-NEXT: xsmindp f1, f1, f2
; NO-FAST-P9-NEXT: blr
;
; NO-FAST-P8-LABEL: testfmin_fast:
; NO-FAST-P8: # %bb.0: # %entry
; NO-FAST-P8-NEXT: xssubsp f0, f1, f2
; NO-FAST-P8-NEXT: fsel f1, f0, f2, f1
; NO-FAST-P8-NEXT: xsmindp f1, f1, f2
; NO-FAST-P8-NEXT: blr
entry:
%cmp = fcmp nnan ninf olt float %a, %b
@@ -180,13 +177,12 @@ define dso_local double @testdmin_fast(double %a, double %b) local_unnamed_addr
;
; NO-FAST-P9-LABEL: testdmin_fast:
; NO-FAST-P9: # %bb.0: # %entry
; NO-FAST-P9-NEXT: xsmincdp f1, f1, f2
; NO-FAST-P9-NEXT: xsmindp f1, f1, f2
; NO-FAST-P9-NEXT: blr
;
; NO-FAST-P8-LABEL: testdmin_fast:
; NO-FAST-P8: # %bb.0: # %entry
; NO-FAST-P8-NEXT: xssubdp f0, f1, f2
; NO-FAST-P8-NEXT: fsel f1, f0, f2, f1
; NO-FAST-P8-NEXT: xsmindp f1, f1, f2
; NO-FAST-P8-NEXT: blr
entry:
%cmp = fcmp nnan ninf olt double %a, %b