[RISCV] Fix the worst case for VSHA2MS in SiFive P400/P600 scheduling models #116893

mshockwave · 2024-11-20T00:05:58Z

For each RVV instruction we should have a single WriteRes assignment to the worst case scheduling class. This assignment is usually equal to that of the largest LMUL + smallest SEW. My #114317 accidentally made two of these assignments on WriteVSHA2MSV_WorstCase. This won't affect our MachineScheduler nor most of our llvm-mca use cases (assuming you populate the correct LMUL and SEW), yet it's not ideal either.

This patch fixes this issue by assigning the correct numbers and resource mapping to WriteVSHA2MSV_WorstCase, which is equal to that of largest LMUL + largest SEW (Zvknh's scheduling properties are special). I also added a MCA test to make sure we always pick up the correct worst case numbers for P600's scheduling model.

Original issue was reported by @reidtatge

… models TBA...

llvmbot · 2024-11-20T00:06:31Z

@llvm/pr-subscribers-backend-risc-v

Author: Min-Yih Hsu (mshockwave)

Changes

For each RVV instruction we should have a single WriteRes assignment to the worst case scheduling class. This assignment is usually equal to that of the largest LMUL + smallest SEW. My #114317 accidentally made two of these assignments on WriteVSHA2MSV_WorstCase. This won't affect our MachineScheduler nor most of our llvm-mca use cases (assuming you populate the correct LMUL and SEW), yet it's not ideal either.

This patch fixes this issue by assigning the correct numbers and resource mapping to WriteVSHA2MSV_WorstCase, which is equal to that of largest LMUL + largest SEW (Zvknh's scheduling properties are special). I also added a MCA test to make sure we always pick up the correct worst case numbers for P600's scheduling model.

Original issue was reported by @reidtatge

Full diff: https://github.com/llvm/llvm-project/pull/116893.diff

3 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVSchedSiFiveP400.td (+10-2)
(modified) llvm/lib/Target/RISCV/RISCVSchedSiFiveP600.td (+9-2)
(modified) llvm/test/tools/llvm-mca/RISCV/SiFiveP600/zvknhb.s (+12-7)

diff --git a/llvm/lib/Target/RISCV/RISCVSchedSiFiveP400.td b/llvm/lib/Target/RISCV/RISCVSchedSiFiveP400.td
index 1af89903e0068c..a86c255f0820ed 100644
--- a/llvm/lib/Target/RISCV/RISCVSchedSiFiveP400.td
+++ b/llvm/lib/Target/RISCV/RISCVSchedSiFiveP400.td
@@ -883,8 +883,16 @@ foreach mx = SchedMxList in {
   let Latency = 3, ReleaseAtCycles = [LMulLat] in {
     defm "" : LMULWriteResMX<"WriteVSHA2CHV", [SiFiveP400VEXQ0], mx, IsWorstCase>;
     defm "" : LMULWriteResMX<"WriteVSHA2CLV", [SiFiveP400VEXQ0], mx, IsWorstCase>;
-    foreach sew = !listremove(SchedSEWSet<mx>.val, [8, 16]) in
-    defm "" : LMULSEWWriteResMXSEW<"WriteVSHA2MSV", [SiFiveP400VEXQ0], mx, sew, IsWorstCase>;
+    defvar ZvknhSEWs = !listremove(SchedSEWSet<mx>.val, [8, 16]);
+    // Largest SEW is the last element, assuming SchedSEWSet is sorted in ascending
+    // order.
+    defvar LargestZvknhSEW = !foldl(!head(ZvknhSEWs), ZvknhSEWs, last, curr, curr);
+    foreach sew = ZvknhSEWs in {
+      // The worst case for Zvknh[ab] is designated to the largest SEW and LMUL.
+      defvar IsWorstCaseVSHA2MSV = !and(IsWorstCase, !eq(sew, LargestZvknhSEW));
+      defm "" : LMULSEWWriteResMXSEW<"WriteVSHA2MSV", [SiFiveP400VEXQ0], mx, sew,
+                                     IsWorstCaseVSHA2MSV>;
+    }
   }
   // Zvkned
   let Latency = 2, ReleaseAtCycles = [LMulLat] in {
diff --git a/llvm/lib/Target/RISCV/RISCVSchedSiFiveP600.td b/llvm/lib/Target/RISCV/RISCVSchedSiFiveP600.td
index c2d93d4c0a7f0a..0c695c9ef30710 100644
--- a/llvm/lib/Target/RISCV/RISCVSchedSiFiveP600.td
+++ b/llvm/lib/Target/RISCV/RISCVSchedSiFiveP600.td
@@ -1135,9 +1135,16 @@ foreach mx = SchedMxList in {
   let Latency = 3, ReleaseAtCycles = [LMulLat] in {
     defm "" : LMULWriteResMX<"WriteVSHA2CHV", [SiFiveP600VectorCrypto], mx, IsWorstCase>;
     defm "" : LMULWriteResMX<"WriteVSHA2CLV", [SiFiveP600VectorCrypto], mx, IsWorstCase>;
-    foreach sew = !listremove(SchedSEWSet<mx>.val, [8, 16]) in {
+    defvar ZvknhSEWs = !listremove(SchedSEWSet<mx>.val, [8, 16]);
+    // Largest SEW is the last element, assuming SchedSEWSet is sorted in ascending
+    // order.
+    defvar LargestZvknhSEW = !foldl(!head(ZvknhSEWs), ZvknhSEWs, last, curr, curr);
+    foreach sew = ZvknhSEWs in {
+      // The worst case for Zvknh[ab] is designated to the largest SEW and LMUL.
+      defvar IsWorstCaseVSHA2MSV = !and(IsWorstCase, !eq(sew, LargestZvknhSEW));
       let ReleaseAtCycles = [SiFiveP600VSHA2MSCycles<mx, sew>.c] in
-      defm "" : LMULSEWWriteResMXSEW<"WriteVSHA2MSV", [SiFiveP600VectorCrypto], mx, sew, IsWorstCase>;
+      defm "" : LMULSEWWriteResMXSEW<"WriteVSHA2MSV", [SiFiveP600VectorCrypto], mx, sew,
+                                     IsWorstCaseVSHA2MSV>;
     }
   }
   // Zvkned
diff --git a/llvm/test/tools/llvm-mca/RISCV/SiFiveP600/zvknhb.s b/llvm/test/tools/llvm-mca/RISCV/SiFiveP600/zvknhb.s
index 20ac87a724af16..adf780279c8954 100644
--- a/llvm/test/tools/llvm-mca/RISCV/SiFiveP600/zvknhb.s
+++ b/llvm/test/tools/llvm-mca/RISCV/SiFiveP600/zvknhb.s
@@ -1,6 +1,9 @@
 # NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
 # RUN: llvm-mca -mtriple=riscv64 -mcpu=sifive-p670 -iterations=1 < %s | FileCheck %s
 
+# Worst case for vsha2ms should be that of LMUL=8 and SEW=64.
+vsha2ms.vv v4, v8, v12
+
 # SEW is only e32 or e64
 
 vsetvli zero, zero, e32, m1, tu, mu
@@ -44,14 +47,14 @@ vsha2ch.vv v8, v16, v24
 vsha2cl.vv v8, v16, v24
 
 # CHECK:      Iterations:        1
-# CHECK-NEXT: Instructions:      32
-# CHECK-NEXT: Total Cycles:      108
-# CHECK-NEXT: Total uOps:        32
+# CHECK-NEXT: Instructions:      33
+# CHECK-NEXT: Total Cycles:      119
+# CHECK-NEXT: Total uOps:        33
 
 # CHECK:      Dispatch Width:    4
-# CHECK-NEXT: uOps Per Cycle:    0.30
-# CHECK-NEXT: IPC:               0.30
-# CHECK-NEXT: Block RThroughput: 97.0
+# CHECK-NEXT: uOps Per Cycle:    0.28
+# CHECK-NEXT: IPC:               0.28
+# CHECK-NEXT: Block RThroughput: 109.0
 
 # CHECK:      Instruction Info:
 # CHECK-NEXT: [1]: #uOps
@@ -62,6 +65,7 @@ vsha2cl.vv v8, v16, v24
 # CHECK-NEXT: [6]: HasSideEffects (U)
 
 # CHECK:      [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
+# CHECK-NEXT:  1      3     12.00                       vsha2ms.vv	v4, v8, v12
 # CHECK-NEXT:  1      1     1.00                  U     vsetvli	zero, zero, e32, m1, tu, mu
 # CHECK-NEXT:  1      3     1.00                        vsha2ms.vv	v4, v8, v12
 # CHECK-NEXT:  1      3     1.00                        vsha2ch.vv	v4, v8, v12
@@ -115,10 +119,11 @@ vsha2cl.vv v8, v16, v24
 
 # CHECK:      Resource pressure per iteration:
 # CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8.0]  [8.1]  [9]    [10]   [11]   [12]   [13]   [14]
-# CHECK-NEXT:  -      -      -      -     8.00    -      -      -      -      -      -     97.00   -      -      -      -
+# CHECK-NEXT:  -      -      -      -     8.00    -      -      -      -      -      -     109.00  -      -      -      -
 
 # CHECK:      Resource pressure by instruction:
 # CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8.0]  [8.1]  [9]    [10]   [11]   [12]   [13]   [14]   Instructions:
+# CHECK-NEXT:  -      -      -      -      -      -      -      -      -      -      -     12.00   -      -      -      -     vsha2ms.vv	v4, v8, v12
 # CHECK-NEXT:  -      -      -      -     1.00    -      -      -      -      -      -      -      -      -      -      -     vsetvli	zero, zero, e32, m1, tu, mu
 # CHECK-NEXT:  -      -      -      -      -      -      -      -      -      -      -     1.00    -      -      -      -     vsha2ms.vv	v4, v8, v12
 # CHECK-NEXT:  -      -      -      -      -      -      -      -      -      -      -     1.00    -      -      -      -     vsha2ch.vv	v4, v8, v12

mshockwave · 2024-11-20T17:10:11Z

We might also want to teach TableGen to warn or even throw an error on cases like this (i.e. duplicate WriteRes assignments)

reidtatge · 2024-11-20T18:22:10Z

Yeah, agreed. I spent just a very few minutes looking at it, and (not being an expert in TableGen backends) it looks like SubtargetEmitter.cpp/findWriteResources is just ignoring the situation, although it does look for aliased duplicates - and aborts with a diagnostic - but doesn't explicitly look for duplicates definitions (around line 925). And CodeGenSchedModels::addProcResource() just quietly returns if its already been seen for a given ProcModel. And I may be missing something else... FWIW :-)

…

-Reid

On Wed, Nov 20, 2024 at 9:10 AM Min-Yih Hsu ***@***.***> wrote: We might also want to teach TableGen to warn or even throw an error on cases like this (i.e. duplicate WriteRes assignments) — Reply to this email directly, view it on GitHub <#116893 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIRIKEYPO32X5DKHLZOLJX32BS7AXAVCNFSM6AAAAABSDKRUNOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIOBZGE2DANZUGY> . You are receiving this because you were mentioned.Message ID: ***@***.***>

reidtatge

LGTM

topperc

LGTM

[RISCV] Fix the worst case for VSHA2MS in SiFive P400/P600 scheduling…

7b087f0

… models TBA...

mshockwave requested review from michaelmaitland, topperc, reidtatge and wangpc-pp November 20, 2024 00:05

llvmbot added the backend:RISC-V label Nov 20, 2024

reidtatge approved these changes Nov 21, 2024

View reviewed changes

topperc approved these changes Nov 21, 2024

View reviewed changes

mshockwave merged commit 0165f88 into llvm:main Nov 21, 2024
8 of 10 checks passed

mshockwave deleted the patch/rvv/vsha2ms-worst-case branch November 21, 2024 18:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[RISCV] Fix the worst case for VSHA2MS in SiFive P400/P600 scheduling models #116893

[RISCV] Fix the worst case for VSHA2MS in SiFive P400/P600 scheduling models #116893

Uh oh!

mshockwave commented Nov 20, 2024

Uh oh!

llvmbot commented Nov 20, 2024

Uh oh!

mshockwave commented Nov 20, 2024

Uh oh!

reidtatge commented Nov 20, 2024 via email

Uh oh!

reidtatge left a comment

Uh oh!

topperc left a comment

Uh oh!

Uh oh!

Uh oh!

[RISCV] Fix the worst case for VSHA2MS in SiFive P400/P600 scheduling models #116893

[RISCV] Fix the worst case for VSHA2MS in SiFive P400/P600 scheduling models #116893

Uh oh!

Conversation

mshockwave commented Nov 20, 2024

Uh oh!

llvmbot commented Nov 20, 2024

Uh oh!

mshockwave commented Nov 20, 2024

Uh oh!

reidtatge commented Nov 20, 2024 via email

Uh oh!

reidtatge left a comment

Choose a reason for hiding this comment

Uh oh!

topperc left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!