[AArch64] Fix sched model of Neoverse N2 #106376
FLZ101 commented on Aug 28, 2024
- fix write order of "Load vector reg, immed post-index"
- fix a typo
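The typo is a `[` where a `(` was intended in one of the `instregex` patterns for the FRINT instructions. A minimal sketch of the effect, using Python's `re` module as a stand-in for the regex engine LLVM uses (an assumption; both treat `[32|64)[XZ]` as a single bracket expression matching exactly one character). The opcode names below are assumed for illustration and are not taken from this PR.

```python
import re

# Pattern before the fix: "[32|64)[XZ]" is ONE character class containing
# the characters 3 2 | 6 4 ) [ X Z, so it consumes exactly one character.
BROKEN = r"^FRINT[32|64)[XZ]v2f(32|64)$"
# Pattern after the fix: an alternation group followed by X or Z.
FIXED = r"^FRINT(32|64)[XZ]v2f(32|64)$"

# Hypothetical opcode names, assumed for illustration only.
opcodes = ["FRINT32Xv2f32", "FRINT64Zv2f64", "FRINTXv2f32"]


def matches(pattern: str, name: str) -> bool:
    """Return True if the whole opcode name matches the anchored pattern."""
    return re.search(pattern, name) is not None


# Map each opcode name to (matched by broken pattern, matched by fixed pattern).
results = {name: (matches(BROKEN, name), matches(FIXED, name)) for name in opcodes}

for name, (broken, fixed) in results.items():
    print(f"{name:15} broken={broken} fixed={fixed}")
```

Under these assumptions the broken pattern never matches a `FRINT32`/`FRINT64` name, so those instructions would not pick up the intended scheduling entry; it only matches names such as `FRINTXv2f32`, which the neighbouring `^FRINT[AIMNPXZ]v2f(32|64)$` pattern already covers.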
@llvm/pr-subscribers-backend-aarch64
Author: Franklin (FLZ101)
Full diff: https://github.com/llvm/llvm-project/pull/106376.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td b/llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td
index a4ac344510de91..8f91caa47f0310 100644
--- a/llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td
+++ b/llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td
@@ -858,7 +858,7 @@ def : InstRW<[N2Write_6cyc_1L], (instregex "^LDR[SDQ]l$",
"^LDUR[BHSDQ]i$")>;
// Load vector reg, immed post-index
-def : InstRW<[N2Write_6cyc_1I_1L, WriteI], (instregex "^LDR[BHSDQ]post$")>;
+def : InstRW<[WriteI, N2Write_6cyc_1I_1L], (instregex "^LDR[BHSDQ]post$")>;
// Load vector reg, immed pre-index
def : InstRW<[WriteAdr, N2Write_6cyc_1I_1L], (instregex "^LDR[BHSDQ]pre$")>;
@@ -1119,7 +1119,7 @@ def : InstRW<[N2Write_5cyc_1V], (instregex "^FMLALv", "^FMLSLv")>;
// ASIMD FP round, D-form F32 and Q-form F64
def : InstRW<[N2Write_3cyc_1V0],
(instregex "^FRINT[AIMNPXZ]v2f(32|64)$",
- "^FRINT[32|64)[XZ]v2f(32|64)$")>;
+ "^FRINT(32|64)[XZ]v2f(32|64)$")>;
// ASIMD FP round, D-form F16 and Q-form F32
def : InstRW<[N2Write_4cyc_2V0],
diff --git a/llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-writeback.s b/llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-writeback.s
index 0c6ccc1face972..5ffaf9138d4823 100644
--- a/llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-writeback.s
+++ b/llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-writeback.s
@@ -3298,28 +3298,28 @@ add x0, x27, 1
# CHECK: Iterations: 100
# CHECK-NEXT: Instructions: 1000
-# CHECK-NEXT: Total Cycles: 3004
+# CHECK-NEXT: Total Cycles: 508
# CHECK-NEXT: Total uOps: 2000
# CHECK: Dispatch Width: 10
-# CHECK-NEXT: uOps Per Cycle: 0.67
-# CHECK-NEXT: IPC: 0.33
+# CHECK-NEXT: uOps Per Cycle: 3.94
+# CHECK-NEXT: IPC: 1.97
# CHECK-NEXT: Block RThroughput: 3.8
# CHECK: Timeline view:
-# CHECK-NEXT: 0123456789 0123
-# CHECK-NEXT: Index 0123456789 0123456789
+# CHECK-NEXT: 012
+# CHECK-NEXT: Index 0123456789
-# CHECK: [0,0] DeeeeeeER . . . . . . ldr b1, [x27], #254
-# CHECK-NEXT: [0,1] D======eER. . . . . . add x0, x27, #1
-# CHECK-NEXT: [0,2] D======eeeeeeER. . . . . ldr h1, [x27], #254
-# CHECK-NEXT: [0,3] D============eER . . . . add x0, x27, #1
-# CHECK-NEXT: [0,4] .D===========eeeeeeER . . . ldr s1, [x27], #254
-# CHECK-NEXT: [0,5] .D=================eER . . . add x0, x27, #1
-# CHECK-NEXT: [0,6] .D=================eeeeeeER . . ldr d1, [x27], #254
-# CHECK-NEXT: [0,7] .D=======================eER . . add x0, x27, #1
-# CHECK-NEXT: [0,8] . D======================eeeeeeER. ldr q1, [x27], #254
-# CHECK-NEXT: [0,9] . D============================eER add x0, x27, #1
+# CHECK: [0,0] DeeeeeeER . . ldr b1, [x27], #254
+# CHECK-NEXT: [0,1] D=eE----R . . add x0, x27, #1
+# CHECK-NEXT: [0,2] D=eeeeeeER. . ldr h1, [x27], #254
+# CHECK-NEXT: [0,3] D==eE----R. . add x0, x27, #1
+# CHECK-NEXT: [0,4] .D=eeeeeeER . ldr s1, [x27], #254
+# CHECK-NEXT: [0,5] .D==eE----R . add x0, x27, #1
+# CHECK-NEXT: [0,6] .D==eeeeeeER. ldr d1, [x27], #254
+# CHECK-NEXT: [0,7] .D===eE----R. add x0, x27, #1
+# CHECK-NEXT: [0,8] . D==eeeeeeER ldr q1, [x27], #254
+# CHECK-NEXT: [0,9] . D===eE----R add x0, x27, #1
# CHECK: Average Wait times (based on the timeline view):
# CHECK-NEXT: [0]: Executions
@@ -3329,16 +3329,16 @@ add x0, x27, 1
# CHECK: [0] [1] [2] [3]
# CHECK-NEXT: 0. 1 1.0 1.0 0.0 ldr b1, [x27], #254
-# CHECK-NEXT: 1. 1 7.0 0.0 0.0 add x0, x27, #1
-# CHECK-NEXT: 2. 1 7.0 0.0 0.0 ldr h1, [x27], #254
-# CHECK-NEXT: 3. 1 13.0 0.0 0.0 add x0, x27, #1
-# CHECK-NEXT: 4. 1 12.0 0.0 0.0 ldr s1, [x27], #254
-# CHECK-NEXT: 5. 1 18.0 0.0 0.0 add x0, x27, #1
-# CHECK-NEXT: 6. 1 18.0 0.0 0.0 ldr d1, [x27], #254
-# CHECK-NEXT: 7. 1 24.0 0.0 0.0 add x0, x27, #1
-# CHECK-NEXT: 8. 1 23.0 0.0 0.0 ldr q1, [x27], #254
-# CHECK-NEXT: 9. 1 29.0 0.0 0.0 add x0, x27, #1
-# CHECK-NEXT: 1 15.2 0.1 0.0 <total>
+# CHECK-NEXT: 1. 1 2.0 0.0 4.0 add x0, x27, #1
+# CHECK-NEXT: 2. 1 2.0 0.0 0.0 ldr h1, [x27], #254
+# CHECK-NEXT: 3. 1 3.0 0.0 4.0 add x0, x27, #1
+# CHECK-NEXT: 4. 1 2.0 0.0 0.0 ldr s1, [x27], #254
+# CHECK-NEXT: 5. 1 3.0 0.0 4.0 add x0, x27, #1
+# CHECK-NEXT: 6. 1 3.0 0.0 0.0 ldr d1, [x27], #254
+# CHECK-NEXT: 7. 1 4.0 0.0 4.0 add x0, x27, #1
+# CHECK-NEXT: 8. 1 3.0 0.0 0.0 ldr q1, [x27], #254
+# CHECK-NEXT: 9. 1 4.0 0.0 4.0 add x0, x27, #1
+# CHECK-NEXT: 1 2.7 0.1 2.0 <total>
# CHECK: [47] Code Region - G48
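The summary figures in the updated llvm-mca test can be re-derived from the instruction, uOp, and cycle counts shown in the diff. The sketch below does only that arithmetic; the function name `mca_summary` is hypothetical. One plausible reading of the result, not stated in the PR itself: each of the 5 post-index loads per iteration updates `x27`, and the old write order charged that update the 6-cycle load latency instead of the 1-cycle `WriteI` latency, which is consistent with roughly 30 versus 5 cycles per iteration.

```python
def mca_summary(instructions: int, uops: int, cycles: int) -> dict:
    """Recompute the two throughput figures llvm-mca prints in its summary."""
    return {
        "uops_per_cycle": round(uops / cycles, 2),
        "ipc": round(instructions / cycles, 2),
    }


# Counts taken from the CHECK lines in the diff (100 iterations, 10 instructions each).
ITERATIONS = 100
before = mca_summary(instructions=1000, uops=2000, cycles=3004)
after = mca_summary(instructions=1000, uops=2000, cycles=508)

# Average cycles per loop iteration before and after the fix.
cycles_per_iter_before = 3004 / ITERATIONS
cycles_per_iter_after = 508 / ITERATIONS

print("before:", before, "cycles/iter:", cycles_per_iter_before)
print("after: ", after, "cycles/iter:", cycles_per_iter_after)
```

The recomputed values agree with the `uOps Per Cycle` and `IPC` lines on both sides of the diff.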
LGTM. I must have missed this one as it is WriteI not WriteAdr
@davemgreen Can this PR be merged?
Can you rebase? I updated some cyc_'s to c_'s to try and keep them consistent between models. Sorry about the conflict.
The rebase is done.
Thanks.