
[PowerPC] Add SDNPMemOperand to some nodes #115580

Merged
merged 4 commits into from
Nov 15, 2024


s-barannikov
Contributor

Nodes created with `getMemIntrinsicNode` have memory operands. For those
operands to be propagated to machine instructions, the nodes need to
have the `SDNPMemOperand` property.

Similar to 3c8c385.
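
As a rough illustration of the mechanism described above, here is a toy Python model (not LLVM code). The names `Node`, `MachineInstr`, and `select` are hypothetical stand-ins, and the memory operand is just a placeholder string. The only behaviour taken from this PR is that a node's memory operands reach the selected machine instruction only when the node definition lists `SDNPMemOperand`.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy stand-in for a target-specific DAG node."""
    opcode: str
    properties: frozenset
    mem_operands: tuple = ()


@dataclass
class MachineInstr:
    """Toy stand-in for the instruction produced by selection."""
    opcode: str
    mem_operands: tuple = ()


def select(node: Node, machine_opcode: str) -> MachineInstr:
    # Assumption of this sketch: memory operands are copied over only
    # when the node is declared with SDNPMemOperand; otherwise dropped.
    if "SDNPMemOperand" in node.properties:
        return MachineInstr(machine_opcode, node.mem_operands)
    return MachineInstr(machine_opcode)


mmo = ("store 16 bytes",)  # placeholder for a real memory operand
before = Node("PPCISD::STXVD2X",
              frozenset({"SDNPHasChain", "SDNPMayStore"}), mmo)
after = Node("PPCISD::STXVD2X",
             frozenset({"SDNPHasChain", "SDNPMayStore", "SDNPMemOperand"}),
             mmo)
```

In this model, `select(before, "STXVD2X")` yields an instruction with no memory operands, while `select(after, "STXVD2X")` keeps them, which is the difference the `.td` changes in this patch are meant to make.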

@llvmbot
Member

llvmbot commented Nov 9, 2024

@llvm/pr-subscribers-backend-powerpc

Author: Sergei Barannikov (s-barannikov)

Changes

Nodes created with getMemIntrinsicNode have memory operands. In order
for operands to be propagated to machine instructions, the nodes should
have SDNPMemOperand property.

Similar to 3c8c385.


Patch is 72.19 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/115580.diff

29 Files Affected:

  • (modified) llvm/lib/Target/PowerPC/PPCInstrInfo.td (+5-5)
  • (modified) llvm/lib/Target/PowerPC/PPCInstrP10.td (+1-1)
  • (modified) llvm/lib/Target/PowerPC/PPCInstrVSX.td (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/const-nonsplat-array-init.ll (+12-12)
  • (modified) llvm/test/CodeGen/PowerPC/const-splat-array-init.ll (+6-6)
  • (modified) llvm/test/CodeGen/PowerPC/extract-and-store.ll (+3-3)
  • (modified) llvm/test/CodeGen/PowerPC/f128-fma.ll (+8-8)
  • (modified) llvm/test/CodeGen/PowerPC/f128-passByValue.ll (+4-4)
  • (modified) llvm/test/CodeGen/PowerPC/pr45301.ll (+1-1)
  • (modified) llvm/test/CodeGen/PowerPC/pr47891.ll (+1-1)
  • (modified) llvm/test/CodeGen/PowerPC/pr59074.ll (+3-3)
  • (modified) llvm/test/CodeGen/PowerPC/swaps-le-1.ll (+250-80)
  • (modified) llvm/test/CodeGen/PowerPC/vec-itofp.ll (+3-3)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp32_to_i16_elts.ll (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp32_to_i64_elts.ll (+10-10)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp64_to_i16_elts.ll (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp64_to_i32_elts.ll (+4-4)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp_to_i_4byte_elts.ll (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp_to_i_8byte_elts.ll (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_i16_to_fp32_elts.ll (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_i16_to_fp64_elts.ll (+8-8)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_i32_to_fp64_elts.ll (+4-4)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_i64_to_fp32_elts.ll (+4-4)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_i8_to_fp32_elts.ll (+4-4)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_i8_to_fp64_elts.ll (+6-6)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_i_to_fp_4byte_elts.ll (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_i_to_fp_8byte_elts.ll (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/wide-scalar-shift-by-byte-multiple-legalization.ll (+61-61)
  • (modified) llvm/test/CodeGen/PowerPC/wide-scalar-shift-legalization.ll (+25-25)
diff --git a/llvm/lib/Target/PowerPC/PPCInstrInfo.td b/llvm/lib/Target/PowerPC/PPCInstrInfo.td
index b4a5e41c0107a3..4daa442d001eb1 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrInfo.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrInfo.td
@@ -166,17 +166,17 @@ def PPCany_fcfidus : PatFrags<(ops node:$op),
 
 def PPCstore_scal_int_from_vsr:
    SDNode<"PPCISD::ST_VSR_SCAL_INT", SDT_PPCstore_scal_int_from_vsr,
-           [SDNPHasChain, SDNPMayStore]>;
+           [SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
 def PPCstfiwx : SDNode<"PPCISD::STFIWX", SDT_PPCstfiwx,
-                       [SDNPHasChain, SDNPMayStore]>;
+                       [SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
 def PPClfiwax : SDNode<"PPCISD::LFIWAX", SDT_PPClfiwx,
                        [SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;
 def PPClfiwzx : SDNode<"PPCISD::LFIWZX", SDT_PPClfiwx,
                        [SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;
 def PPClxsizx : SDNode<"PPCISD::LXSIZX", SDT_PPCLxsizx,
-                       [SDNPHasChain, SDNPMayLoad]>;
+                       [SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;
 def PPCstxsix : SDNode<"PPCISD::STXSIX", SDT_PPCstxsix,
-                       [SDNPHasChain, SDNPMayStore]>;
+                       [SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
 def PPCVexts  : SDNode<"PPCISD::VEXTS", SDT_PPCVexts, []>;
 
 // Extract FPSCR (not modeled at the DAG level).
@@ -376,7 +376,7 @@ def PPCatomicCmpSwap_16 :
 def PPClbrx       : SDNode<"PPCISD::LBRX", SDT_PPClbrx,
                            [SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;
 def PPCstbrx      : SDNode<"PPCISD::STBRX", SDT_PPCstbrx,
-                           [SDNPHasChain, SDNPMayStore]>;
+                           [SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
 def PPCStoreCond  : SDNode<"PPCISD::STORE_COND", SDT_StoreCond,
                            [SDNPHasChain, SDNPMayStore,
                             SDNPMemOperand, SDNPOutGlue]>;
diff --git a/llvm/lib/Target/PowerPC/PPCInstrP10.td b/llvm/lib/Target/PowerPC/PPCInstrP10.td
index c4b8597b1df9ff..2fe94f9462b26c 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrP10.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrP10.td
@@ -105,7 +105,7 @@ def SDT_PPCLXVRZX : SDTypeProfile<1, 2, [
 
 // PPC Specific DAG Nodes.
 def PPClxvrzx : SDNode<"PPCISD::LXVRZX", SDT_PPCLXVRZX,
-                       [SDNPHasChain, SDNPMayLoad]>;
+                       [SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;
 
 // Top-level class for prefixed instructions.
 class PI<bits<6> pref, bits<6> opcode, dag OOL, dag IOL, string asmstr,
diff --git a/llvm/lib/Target/PowerPC/PPCInstrVSX.td b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
index fe9ab22c576349..8e400bc63b7851 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrVSX.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
@@ -90,11 +90,11 @@ def SDT_PPCxxperm : SDTypeProfile<1, 3, [
 def PPClxvd2x  : SDNode<"PPCISD::LXVD2X", SDT_PPClxvd2x,
                         [SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;
 def PPCstxvd2x : SDNode<"PPCISD::STXVD2X", SDT_PPCstxvd2x,
-                        [SDNPHasChain, SDNPMayStore]>;
+                        [SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
 def PPCld_vec_be  : SDNode<"PPCISD::LOAD_VEC_BE", SDT_PPCld_vec_be,
                         [SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;
 def PPCst_vec_be : SDNode<"PPCISD::STORE_VEC_BE", SDT_PPCst_vec_be,
-                        [SDNPHasChain, SDNPMayStore]>;
+                        [SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;
 def PPCxxswapd : SDNode<"PPCISD::XXSWAPD", SDT_PPCxxswapd, [SDNPHasChain]>;
 def PPCmfvsr : SDNode<"PPCISD::MFVSR", SDTUnaryOp, []>;
 def PPCmtvsra : SDNode<"PPCISD::MTVSRA", SDTUnaryOp, []>;
diff --git a/llvm/test/CodeGen/PowerPC/const-nonsplat-array-init.ll b/llvm/test/CodeGen/PowerPC/const-nonsplat-array-init.ll
index 18a61d071cca6c..0a701c22b4621c 100644
--- a/llvm/test/CodeGen/PowerPC/const-nonsplat-array-init.ll
+++ b/llvm/test/CodeGen/PowerPC/const-nonsplat-array-init.ll
@@ -55,9 +55,9 @@ define dso_local void @foo1_int_be_reuse4B(ptr nocapture noundef writeonly %a) l
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 1798
 ; P8-LE-NEXT:    ori 4, 4, 1284
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
 ; P8-LE-NEXT:    li 4, 2312
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    sth 4, 20(3)
 ; P8-LE-NEXT:    blr
 ;
@@ -143,9 +143,9 @@ define dso_local void @foo2_int_le_reuse4B(ptr nocapture noundef writeonly %a) l
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 2826
 ; P8-LE-NEXT:    ori 4, 4, 2312
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
 ; P8-LE-NEXT:    li 4, 3340
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    sth 4, 20(3)
 ; P8-LE-NEXT:    blr
 ;
@@ -231,9 +231,9 @@ define dso_local void @foo3_int_be_reuse4B(ptr nocapture noundef writeonly %a) l
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 1543
 ; P8-LE-NEXT:    ori 4, 4, 1029
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
 ; P8-LE-NEXT:    li 4, 2057
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    sth 4, 20(3)
 ; P8-LE-NEXT:    blr
 ;
@@ -313,9 +313,9 @@ define dso_local void @foo4_int_le_reuse4B(ptr nocapture noundef writeonly %a) l
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 2571
 ; P8-LE-NEXT:    ori 4, 4, 2057
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
 ; P8-LE-NEXT:    li 4, 3085
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    sth 4, 20(3)
 ; P8-LE-NEXT:    blr
 ;
@@ -389,8 +389,8 @@ define dso_local void @foo5_int_be_reuse4B(ptr nocapture noundef writeonly %a) l
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 1029
 ; P8-LE-NEXT:    ori 4, 4, 1543
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo5_int_be_reuse4B:
@@ -455,8 +455,8 @@ define dso_local void @foo6_int_le_reuse4B(ptr nocapture noundef writeonly %a) l
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 2057
 ; P8-LE-NEXT:    ori 4, 4, 2571
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo6_int_le_reuse4B:
@@ -1221,8 +1221,8 @@ define dso_local void @foo15_int_noreuse4B(ptr nocapture noundef writeonly %a) l
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 1029
 ; P8-LE-NEXT:    ori 4, 4, 1544
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo15_int_noreuse4B:
@@ -1371,8 +1371,8 @@ define dso_local void @foo17_fp_be_reuse4B(ptr nocapture noundef writeonly %a) l
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 16673
 ; P8-LE-NEXT:    ori 4, 4, 39322
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo17_fp_be_reuse4B:
@@ -1437,8 +1437,8 @@ define dso_local void @foo18_fp_le_reuse4B(ptr nocapture noundef writeonly %a) l
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 16675
 ; P8-LE-NEXT:    ori 4, 4, 13107
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo18_fp_le_reuse4B:
@@ -1504,8 +1504,8 @@ define dso_local void @foo19_fp_be_reuse8B(ptr nocapture noundef writeonly %a) l
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    li 4, 4105
 ; P8-LE-NEXT:    rldic 4, 4, 50, 1
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    std 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo19_fp_be_reuse8B:
@@ -1649,8 +1649,8 @@ define dso_local void @foo21_fp_noreuse4B(ptr nocapture noundef writeonly %a) lo
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 16268
 ; P8-LE-NEXT:    ori 4, 4, 52430
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo21_fp_noreuse4B:
@@ -1716,8 +1716,8 @@ define dso_local void @foo22_fp_noreuse8B(ptr nocapture noundef writeonly %a) lo
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    li 4, 21503
 ; P8-LE-NEXT:    rotldi 4, 4, 52
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    std 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo22_fp_noreuse8B:
diff --git a/llvm/test/CodeGen/PowerPC/const-splat-array-init.ll b/llvm/test/CodeGen/PowerPC/const-splat-array-init.ll
index 4139a8fbcbb4f1..83acb4fac8a76a 100644
--- a/llvm/test/CodeGen/PowerPC/const-splat-array-init.ll
+++ b/llvm/test/CodeGen/PowerPC/const-splat-array-init.ll
@@ -45,8 +45,8 @@ define dso_local void @foo1(ptr nocapture noundef writeonly %a) local_unnamed_ad
 ; P8-LE-NEXT:    addi 4, 4, .LCPI0_0@toc@l
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    li 4, 3333
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    sth 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo1:
@@ -109,8 +109,8 @@ define dso_local void @foo2(ptr nocapture noundef writeonly %a) local_unnamed_ad
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 3333
 ; P8-LE-NEXT:    ori 4, 4, 3333
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo2:
@@ -182,9 +182,9 @@ define dso_local void @foo3(ptr nocapture noundef writeonly %a) local_unnamed_ad
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 3333
 ; P8-LE-NEXT:    ori 4, 4, 3333
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
 ; P8-LE-NEXT:    li 4, 3333
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    sth 4, 20(3)
 ; P8-LE-NEXT:    blr
 ;
@@ -334,8 +334,8 @@ define dso_local void @foo5(ptr nocapture noundef writeonly %a) local_unnamed_ad
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 5
 ; P8-LE-NEXT:    ori 4, 4, 5653
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo5:
@@ -473,8 +473,8 @@ define dso_local void @foo7(ptr nocapture noundef writeonly %a) local_unnamed_ad
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 508
 ; P8-LE-NEXT:    ori 4, 4, 41045
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    std 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo7:
@@ -539,8 +539,8 @@ define dso_local void @foo8(ptr nocapture noundef writeonly %a) local_unnamed_ad
 ; P8-LE-NEXT:    lxvd2x 0, 0, 4
 ; P8-LE-NEXT:    lis 4, 16469
 ; P8-LE-NEXT:    ori 4, 4, 7864
-; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    stw 4, 16(3)
+; P8-LE-NEXT:    stxvd2x 0, 0, 3
 ; P8-LE-NEXT:    blr
 ;
 ; P9-LE-LABEL: foo8:
diff --git a/llvm/test/CodeGen/PowerPC/extract-and-store.ll b/llvm/test/CodeGen/PowerPC/extract-and-store.ll
index 8bf4013160d8e9..13839a7cd20760 100644
--- a/llvm/test/CodeGen/PowerPC/extract-and-store.ll
+++ b/llvm/test/CodeGen/PowerPC/extract-and-store.ll
@@ -574,13 +574,13 @@ define dso_local void @test_stores_exceed_vec_size(<4 x i32> %a, ptr nocapture %
 ; CHECK-NEXT:    addi r3, r3, .LCPI16_0@toc@l
 ; CHECK-NEXT:    lxvd2x vs0, 0, r3
 ; CHECK-NEXT:    li r3, 16
+; CHECK-NEXT:    stfiwx f1, r5, r3
+; CHECK-NEXT:    li r3, 20
+; CHECK-NEXT:    stxsiwx vs34, r5, r3
 ; CHECK-NEXT:    xxswapd vs35, vs0
 ; CHECK-NEXT:    vperm v3, v2, v2, v3
 ; CHECK-NEXT:    xxswapd vs0, vs35
 ; CHECK-NEXT:    stxvd2x vs0, 0, r5
-; CHECK-NEXT:    stfiwx f1, r5, r3
-; CHECK-NEXT:    li r3, 20
-; CHECK-NEXT:    stxsiwx vs34, r5, r3
 ; CHECK-NEXT:    blr
 ;
 ; CHECK-BE-LABEL: test_stores_exceed_vec_size:
diff --git a/llvm/test/CodeGen/PowerPC/f128-fma.ll b/llvm/test/CodeGen/PowerPC/f128-fma.ll
index d830727e78fbf1..d55697422c7eba 100644
--- a/llvm/test/CodeGen/PowerPC/f128-fma.ll
+++ b/llvm/test/CodeGen/PowerPC/f128-fma.ll
@@ -39,10 +39,10 @@ define void @qpFmadd(ptr nocapture readonly %a, ptr nocapture %b,
 ; CHECK-P8-NEXT:    vmr v3, v31
 ; CHECK-P8-NEXT:    bl __addkf3
 ; CHECK-P8-NEXT:    nop
-; CHECK-P8-NEXT:    xxswapd vs0, v2
 ; CHECK-P8-NEXT:    li r3, 48
-; CHECK-P8-NEXT:    stxvd2x vs0, 0, r30
+; CHECK-P8-NEXT:    xxswapd vs0, v2
 ; CHECK-P8-NEXT:    lvx v31, r1, r3 # 16-byte Folded Reload
+; CHECK-P8-NEXT:    stxvd2x vs0, 0, r30
 ; CHECK-P8-NEXT:    ld r30, 64(r1) # 8-byte Folded Reload
 ; CHECK-P8-NEXT:    addi r1, r1, 80
 ; CHECK-P8-NEXT:    ld r0, 16(r1)
@@ -95,10 +95,10 @@ define void @qpFmadd_02(ptr nocapture readonly %a,
 ; CHECK-P8-NEXT:    vmr v2, v31
 ; CHECK-P8-NEXT:    bl __addkf3
 ; CHECK-P8-NEXT:    nop
-; CHECK-P8-NEXT:    xxswapd vs0, v2
 ; CHECK-P8-NEXT:    li r3, 48
-; CHECK-P8-NEXT:    stxvd2x vs0, 0, r30
+; CHECK-P8-NEXT:    xxswapd vs0, v2
 ; CHECK-P8-NEXT:    lvx v31, r1, r3 # 16-byte Folded Reload
+; CHECK-P8-NEXT:    stxvd2x vs0, 0, r30
 ; CHECK-P8-NEXT:    ld r30, 64(r1) # 8-byte Folded Reload
 ; CHECK-P8-NEXT:    addi r1, r1, 80
 ; CHECK-P8-NEXT:    ld r0, 16(r1)
@@ -214,8 +214,8 @@ define void @qpFnmadd(ptr nocapture readonly %a,
 ; CHECK-P8-NEXT:    stb r4, 63(r1)
 ; CHECK-P8-NEXT:    lxvd2x vs0, 0, r3
 ; CHECK-P8-NEXT:    li r3, 64
-; CHECK-P8-NEXT:    stxvd2x vs0, 0, r30
 ; CHECK-P8-NEXT:    lvx v31, r1, r3 # 16-byte Folded Reload
+; CHECK-P8-NEXT:    stxvd2x vs0, 0, r30
 ; CHECK-P8-NEXT:    ld r30, 80(r1) # 8-byte Folded Reload
 ; CHECK-P8-NEXT:    addi r1, r1, 96
 ; CHECK-P8-NEXT:    ld r0, 16(r1)
@@ -331,10 +331,10 @@ define void @qpFmsub(ptr nocapture readonly %a,
 ; CHECK-P8-NEXT:    vmr v2, v31
 ; CHECK-P8-NEXT:    bl __subkf3
 ; CHECK-P8-NEXT:    nop
-; CHECK-P8-NEXT:    xxswapd vs0, v2
 ; CHECK-P8-NEXT:    li r3, 48
-; CHECK-P8-NEXT:    stxvd2x vs0, 0, r30
+; CHECK-P8-NEXT:    xxswapd vs0, v2
 ; CHECK-P8-NEXT:    lvx v31, r1, r3 # 16-byte Folded Reload
+; CHECK-P8-NEXT:    stxvd2x vs0, 0, r30
 ; CHECK-P8-NEXT:    ld r30, 64(r1) # 8-byte Folded Reload
 ; CHECK-P8-NEXT:    addi r1, r1, 80
 ; CHECK-P8-NEXT:    ld r0, 16(r1)
@@ -451,8 +451,8 @@ define void @qpFnmsub(ptr nocapture readonly %a,
 ; CHECK-P8-NEXT:    stb r4, 63(r1)
 ; CHECK-P8-NEXT:    lxvd2x vs0, 0, r3
 ; CHECK-P8-NEXT:    li r3, 64
-; CHECK-P8-NEXT:    stxvd2x vs0, 0, r30
 ; CHECK-P8-NEXT:    lvx v31, r1, r3 # 16-byte Folded Reload
+; CHECK-P8-NEXT:    stxvd2x vs0, 0, r30
 ; CHECK-P8-NEXT:    ld r30, 80(r1) # 8-byte Folded Reload
 ; CHECK-P8-NEXT:    addi r1, r1, 96
 ; CHECK-P8-NEXT:    ld r0, 16(r1)
diff --git a/llvm/test/CodeGen/PowerPC/f128-passByValue.ll b/llvm/test/CodeGen/PowerPC/f128-passByValue.ll
index 04a7d78d714cc5..1572cc082af3ea 100644
--- a/llvm/test/CodeGen/PowerPC/f128-passByValue.ll
+++ b/llvm/test/CodeGen/PowerPC/f128-passByValue.ll
@@ -576,13 +576,13 @@ define void @mixParam_03(fp128 %f1, ptr nocapture %d1, <4 x i32> %vec1,
 ; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    .cfi_offset v31, -32
 ; CHECK-P8-NEXT:    ld r4, 184(r1)
-; CHECK-P8-NEXT:    li r3, 48
 ; CHECK-P8-NEXT:    xxswapd vs0, v2
 ; CHECK-P8-NEXT:    xxswapd vs1, v3
+; CHECK-P8-NEXT:    li r3, 48
 ; CHECK-P8-NEXT:    std r30, 64(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    mr r30, r5
-; CHECK-P8-NEXT:    stvx v31, r1, r3 # 16-byte Folded Spill
 ; CHECK-P8-NEXT:    stxvd2x vs0, 0, r9
+; CHECK-P8-NEXT:    stvx v31, r1, r3 # 16-byte Folded Spill
 ; CHECK-P8-NEXT:    mr r3, r10
 ; CHECK-P8-NEXT:    stxvd2x vs1, 0, r4
 ; CHECK-P8-NEXT:    lxvd2x vs0, 0, r9
@@ -639,15 +639,15 @@ define fastcc void @mixParam_03f(fp128 %f1, ptr nocapture %d1, <4 x i32> %vec1,
 ; CHECK-P8-NEXT:    .cfi_offset lr, 16
 ; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    .cfi_offset v31, -32
-; CHECK-P8-NEXT:    li r6, 48
 ; CHECK-P8-NEXT:    xxswapd vs0, v2
 ; CHECK-P8-NEXT:    xxswapd vs1, v3
 ; CHECK-P8-NEXT:    std r30, 64(r1) # 8-byte Folded Spill
+; CHECK-P8-NEXT:    li r6, 48
 ; CHECK-P8-NEXT:    mr r30, r3
 ; CHECK-P8-NEXT:    mr r3, r5
-; CHECK-P8-NEXT:    stvx v31, r1, r6 # 16-byte Folded Spill
 ; CHECK-P8-NEXT:    stxvd2x vs0, 0, r4
 ; CHECK-P8-NEXT:    stxvd2x vs1, 0, r7
+; CHECK-P8-NEXT:    stvx v31, r1, r6 # 16-byte Folded Spill
 ; CHECK-P8-NEXT:    lxvd2x vs0, 0, r4
 ; CHECK-P8-NEXT:    xxswapd v31, vs0
 ; CHECK-P8-NEXT:    bl __floatsikf
diff --git a/llvm/test/CodeGen/PowerPC/pr45301.ll b/llvm/test/CodeGen/PowerPC/pr45301.ll
index bb6252e572a1b9..40054ce73188d7 100644
--- a/llvm/test/CodeGen/PowerPC/pr45301.ll
+++ b/llvm/test/CodeGen/PowerPC/pr45301.ll
@@ -23,9 +23,9 @@ define dso_local void @g(ptr %agg.result) local_unnamed_addr #0 {
 ; CHECK-NEXT:    ld r7, 24(r5)
 ; CHECK-NEXT:    std r7, 24(r3)
 ; CHECK-NEXT:    ld r5, 32(r5)
-; CHECK-NEXT:    std r5, 32(r3)
 ; CHECK-NEXT:    stwbrx r4, 0, r3
 ; CHECK-NEXT:    li r4, 20
+; CHECK-NEXT:    std r5, 32(r3)
 ; CHECK-NEXT:    stwbrx r6, r3, r4
 ; CHECK-NEXT:    addi r1, r1, 112
 ; CHECK-NEXT:    ld r0, 16(r1)
diff --git a/llvm/test/CodeGen/PowerPC/pr47891.ll b/llvm/test/CodeGen/PowerPC/pr47891.ll
index 0949b814a13101..4e41b3ee121550 100644
--- a/llvm/test/CodeGen/PowerPC/pr47891.ll
+++ b/llvm/test/CodeGen/PowerPC/pr47891.ll
@@ -55,9 +55,9 @@ define dso_local void @poly2_lshift1(ptr nocapture %p) local_unnamed_addr #0 {
 ; CHECK-NEXT:    std r6, 56(r3)
 ; CHECK-NEXT:    rotldi r6, r7, 1
 ; CHECK-NEXT:    xxswapd vs0, vs0
+; CHECK-NEXT:    stxvd2x vs0, r3, r4
 ; CHECK-NEXT:    rldimi r6, r5, 1, 0
 ; CHECK-NEXT:    std r6, 64(r3)
-; CHECK-NEXT:    stxvd2x vs0, r3, r4
 ; CHECK-NEXT:    blr
 entry:
   %0 = load i64, ptr %p, align 8
diff --git a/llvm/test/CodeGen/PowerPC/pr59074.ll b/llvm/test/CodeGen/PowerPC/pr59074.ll
index d3ca1139b4fd11..6264b9f22876cc 100644
--- a/llvm/test/CodeGen/PowerPC/pr59074.ll
+++ b/llvm/test/CodeGen/PowerPC/pr59074.ll
@@ -33,13 +33,13 @@ define void @pr59074(ptr %0) {
 ; LE32-NEXT:    li 8, 12
 ; LE32-NEXT:    xxswapd 0, 0
 ; LE32-NEXT:    rlwimi 5, 6, 0, 30, 28
-; LE32-NEXT:    addi 4, 4, -12
-; LE32-NEXT:    rlwinm 9, 4, 29, 28, 29
-; LE32-NEXT:    stxvd2x 0, 0, 5
 ; LE32-NEXT:    stw 7, 44(1)
+; LE32-NEXT:    addi 4, 4, -12
 ; LE32-NEXT:    stw 7, 40(1)
 ; LE32-NEXT:    stw 7, 36(1)
 ; LE32-NEXT:    stw 8, 16(1)
+; LE32-NEXT:    rlwinm 9, 4, 29, 28, 29
+; LE32-NEXT:    stxvd2x 0, 0, 5
 ; LE32-NEXT:    clrlwi 4, 4, 27
 ; LE32-NEXT:    lwzux 5, 9, 6
 ; LE32-NEXT:    lwz 6, 8(9)
diff --git a/llvm/test/CodeGen/PowerPC/swaps-le-1.ll b/llvm/test/CodeGen/PowerPC/swaps-le-1.ll
index e2a61d7060ff2a..f3e34101efa29a 100644
--- a/llvm/test/CodeGen/PowerPC/swaps-le-1.ll
+++ b/llvm/test/CodeGen/PowerPC/swaps-le-1.ll
@@ -1,9 +1,12 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+
 ; RUN: llc -verify-machineinstrs -O3 -mcpu=pwr8 \
-; RUN:   -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck  %s
+; RUN:   -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck  \
+; RUN:   -check-prefix=CHECK-P8 %s
 
 ; RUN: llc -verify-machineinstrs -O3 -mcpu=pwr8 -disable-ppc-vsx-swap-removal \
 ; RUN:   -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck  \
-; RUN:   -check-prefix=NOOPTSWAP %s
+; RUN:   -check-prefix=NOOPTSWAP-P8 %s
 
 ; RUN: llc -O3 -mcpu=pwr9 -mtriple=powerpc64le-unknown-linux-gnu \
 ; RUN:  -verify-machineinstrs -ppc-vsr-nums-as-vr < %s | FileCheck  \
@@ -11,7 +14,7 @@
 
 ; RUN: llc -O3 -mcpu=pwr9 -disable-ppc-vsx-swap-removal -mattr=-power9-vector \
 ; RUN:  -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu < %s \
-; RUN:  | FileCheck  -check-prefix=NOOPTSWAP %s
+; RUN:  | FileCheck  -check-prefix=NOOPTSWAP-P9 %s
 
 ; LH: 2016-11-17
 ;   Updated align attritue from 16 to 8 to keep swap instructions tests.
@@ -41,6 +44,250 @@
 @ca = common global [4096 x i32] zeroinitializer, align 8
 
 define void @foo() {
+; CHECK-P8-LABEL: foo:
+; CHECK-P8:       # %bb.0: # %entry
+; CHECK-P8-NEXT:    li 3, 256
+; CHECK-P8-NEXT:    std 30, -16(1) # 8-byte Folded Spill
+; CHECK-P8-NEXT:    addis 4, 2, .LC0@toc@ha
+; CHECK-P8-NEXT:    addis 5, 2, .LC1@toc@ha
+; CHECK-P8-NEXT:    addis 6, 2, .LC2@toc@ha
+; CHECK-P8-NEXT:    addis 7, 2, .LC3@toc@ha
+; CHECK-P8-NEXT:    li 8, 16
+; CHECK-P8-NEXT:    li 9, 32
+; CHECK-P8-NEXT:    mtctr 3
+; CHECK-P8-NEXT:    ld 4, .LC0@toc@l(4)
+; CHECK-P8-NEXT:    ld 5, .LC1@toc@l(5)
+; CHECK-P8-NEXT:    ld 6, .LC2@toc@l(6)
+; CHECK-P8-NEXT:    ld 7, .LC3@toc@l(7)
+; CHECK-P8-NEXT:    li 3, 0
+; CHECK-P8-NEXT:    li 10, 48
+; CHECK-P8-NEXT:    .p2align 4
+; CHECK-P8-NEXT:  .LBB0_1: # %vector.body
+; CHECK-P8-NEXT:    #
+; CHECK-P8-NEXT:    lxvd2x 34, 4...
[truncated]

; CHECK-NEXT: xxswapd 34, 0
; CHECK-NEXT: lxvd2x 0, 0, 4
; CHECK-NEXT: stxvd2x 0, 0, 3
; CHECK-NEXT: blr
Contributor Author

@s-barannikov s-barannikov Nov 12, 2024

asm before this change:

	addi 4, 1, -32
	lxvd2x 0, 0, 4
	stxvd2x 0, 0, 3
	lxvd2x 0, 0, 4
	xxswapd	34, 0
	blr

Now that `stxvd2x` gets a proper memory operand, the machine scheduler is able to hoist the second `lxvd2x` above the store, since the two access different memory locations. This made the old `CHECK-NOT: lxvd2x` fail, so I just regenerated the check lines.
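
The reordering argument can be sketched with a toy alias check in Python. This is not how the machine scheduler is implemented; `MemOperand`, `may_alias`, and `can_hoist_load_above_store` are hypothetical names, and the rule that distinct bases never overlap is a simplifying assumption of the sketch. It only shows the conservative fallback: an access with no memory operand has to be treated as possibly aliasing everything.

```python
from typing import NamedTuple, Optional


class MemOperand(NamedTuple):
    """Toy description of a memory access: base object, offset, size."""
    base: str
    offset: int
    size: int


def may_alias(a: Optional[MemOperand], b: Optional[MemOperand]) -> bool:
    # No memory operand means nothing is known about the access,
    # so the only safe answer is "may alias".
    if a is None or b is None:
        return True
    # Toy assumption: different base objects never overlap.
    if a.base != b.base:
        return False
    # Same base: check whether the byte ranges intersect.
    return a.offset < b.offset + b.size and b.offset < a.offset + a.size


def can_hoist_load_above_store(load: Optional[MemOperand],
                               store: Optional[MemOperand]) -> bool:
    return not may_alias(load, store)


stack_load = MemOperand("stack_slot", 0, 16)   # hypothetical reload
store_with_mmo = MemOperand("arg_ptr", 0, 16)  # hypothetical store target
```

With `store=None` (the situation before this change) the load cannot move; with `store_with_mmo` it can, which matches the reordering seen in the regenerated check lines.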

Collaborator

@RolandF77 RolandF77 left a comment

LGTM

@s-barannikov s-barannikov merged commit 032014e into llvm:main Nov 15, 2024
8 checks passed
@s-barannikov s-barannikov deleted the tablegen/ppc-mem-operands branch November 15, 2024 17:37