Skip to content

[AArch64] Fix sched model of Neoverse N2 #106376

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 18, 2024
Merged

[AArch64] Fix sched model of Neoverse N2 #106376

merged 1 commit into from
Sep 18, 2024

Conversation

FLZ101
Copy link
Contributor

@FLZ101 FLZ101 commented Aug 28, 2024

  • fix write order of "Load vector reg, immed post-index"
  • fix a typo

@llvmbot
Copy link
Member

llvmbot commented Aug 28, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Franklin (FLZ101)

Changes
  • fix write order of "Load vector reg, immed post-index"
  • fix a typo

Full diff: https://github.com/llvm/llvm-project/pull/106376.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td (+2-2)
  • (modified) llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-writeback.s (+25-25)
diff --git a/llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td b/llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td
index a4ac344510de91..8f91caa47f0310 100644
--- a/llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td
+++ b/llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td
@@ -858,7 +858,7 @@ def : InstRW<[N2Write_6cyc_1L], (instregex "^LDR[SDQ]l$",
                                            "^LDUR[BHSDQ]i$")>;
 
 // Load vector reg, immed post-index
-def : InstRW<[N2Write_6cyc_1I_1L, WriteI], (instregex "^LDR[BHSDQ]post$")>;
+def : InstRW<[WriteI, N2Write_6cyc_1I_1L], (instregex "^LDR[BHSDQ]post$")>;
 // Load vector reg, immed pre-index
 def : InstRW<[WriteAdr, N2Write_6cyc_1I_1L], (instregex "^LDR[BHSDQ]pre$")>;
 
@@ -1119,7 +1119,7 @@ def : InstRW<[N2Write_5cyc_1V], (instregex "^FMLALv", "^FMLSLv")>;
 // ASIMD FP round, D-form F32 and Q-form F64
 def : InstRW<[N2Write_3cyc_1V0],
              (instregex "^FRINT[AIMNPXZ]v2f(32|64)$",
-                        "^FRINT[32|64)[XZ]v2f(32|64)$")>;
+                        "^FRINT(32|64)[XZ]v2f(32|64)$")>;
 
 // ASIMD FP round, D-form F16 and Q-form F32
 def : InstRW<[N2Write_4cyc_2V0],
diff --git a/llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-writeback.s b/llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-writeback.s
index 0c6ccc1face972..5ffaf9138d4823 100644
--- a/llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-writeback.s
+++ b/llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-writeback.s
@@ -3298,28 +3298,28 @@ add x0, x27, 1
 
 # CHECK:      Iterations:        100
 # CHECK-NEXT: Instructions:      1000
-# CHECK-NEXT: Total Cycles:      3004
+# CHECK-NEXT: Total Cycles:      508
 # CHECK-NEXT: Total uOps:        2000
 
 # CHECK:      Dispatch Width:    10
-# CHECK-NEXT: uOps Per Cycle:    0.67
-# CHECK-NEXT: IPC:               0.33
+# CHECK-NEXT: uOps Per Cycle:    3.94
+# CHECK-NEXT: IPC:               1.97
 # CHECK-NEXT: Block RThroughput: 3.8
 
 # CHECK:      Timeline view:
-# CHECK-NEXT:                     0123456789          0123
-# CHECK-NEXT: Index     0123456789          0123456789
+# CHECK-NEXT:                     012
+# CHECK-NEXT: Index     0123456789
 
-# CHECK:      [0,0]     DeeeeeeER .    .    .    .    .  .   ldr	b1, [x27], #254
-# CHECK-NEXT: [0,1]     D======eER.    .    .    .    .  .   add	x0, x27, #1
-# CHECK-NEXT: [0,2]     D======eeeeeeER.    .    .    .  .   ldr	h1, [x27], #254
-# CHECK-NEXT: [0,3]     D============eER    .    .    .  .   add	x0, x27, #1
-# CHECK-NEXT: [0,4]     .D===========eeeeeeER    .    .  .   ldr	s1, [x27], #254
-# CHECK-NEXT: [0,5]     .D=================eER   .    .  .   add	x0, x27, #1
-# CHECK-NEXT: [0,6]     .D=================eeeeeeER   .  .   ldr	d1, [x27], #254
-# CHECK-NEXT: [0,7]     .D=======================eER  .  .   add	x0, x27, #1
-# CHECK-NEXT: [0,8]     . D======================eeeeeeER.   ldr	q1, [x27], #254
-# CHECK-NEXT: [0,9]     . D============================eER   add	x0, x27, #1
+# CHECK:      [0,0]     DeeeeeeER . .   ldr	b1, [x27], #254
+# CHECK-NEXT: [0,1]     D=eE----R . .   add	x0, x27, #1
+# CHECK-NEXT: [0,2]     D=eeeeeeER. .   ldr	h1, [x27], #254
+# CHECK-NEXT: [0,3]     D==eE----R. .   add	x0, x27, #1
+# CHECK-NEXT: [0,4]     .D=eeeeeeER .   ldr	s1, [x27], #254
+# CHECK-NEXT: [0,5]     .D==eE----R .   add	x0, x27, #1
+# CHECK-NEXT: [0,6]     .D==eeeeeeER.   ldr	d1, [x27], #254
+# CHECK-NEXT: [0,7]     .D===eE----R.   add	x0, x27, #1
+# CHECK-NEXT: [0,8]     . D==eeeeeeER   ldr	q1, [x27], #254
+# CHECK-NEXT: [0,9]     . D===eE----R   add	x0, x27, #1
 
 # CHECK:      Average Wait times (based on the timeline view):
 # CHECK-NEXT: [0]: Executions
@@ -3329,16 +3329,16 @@ add x0, x27, 1
 
 # CHECK:            [0]    [1]    [2]    [3]
 # CHECK-NEXT: 0.     1     1.0    1.0    0.0       ldr	b1, [x27], #254
-# CHECK-NEXT: 1.     1     7.0    0.0    0.0       add	x0, x27, #1
-# CHECK-NEXT: 2.     1     7.0    0.0    0.0       ldr	h1, [x27], #254
-# CHECK-NEXT: 3.     1     13.0   0.0    0.0       add	x0, x27, #1
-# CHECK-NEXT: 4.     1     12.0   0.0    0.0       ldr	s1, [x27], #254
-# CHECK-NEXT: 5.     1     18.0   0.0    0.0       add	x0, x27, #1
-# CHECK-NEXT: 6.     1     18.0   0.0    0.0       ldr	d1, [x27], #254
-# CHECK-NEXT: 7.     1     24.0   0.0    0.0       add	x0, x27, #1
-# CHECK-NEXT: 8.     1     23.0   0.0    0.0       ldr	q1, [x27], #254
-# CHECK-NEXT: 9.     1     29.0   0.0    0.0       add	x0, x27, #1
-# CHECK-NEXT:        1     15.2   0.1    0.0       <total>
+# CHECK-NEXT: 1.     1     2.0    0.0    4.0       add	x0, x27, #1
+# CHECK-NEXT: 2.     1     2.0    0.0    0.0       ldr	h1, [x27], #254
+# CHECK-NEXT: 3.     1     3.0    0.0    4.0       add	x0, x27, #1
+# CHECK-NEXT: 4.     1     2.0    0.0    0.0       ldr	s1, [x27], #254
+# CHECK-NEXT: 5.     1     3.0    0.0    4.0       add	x0, x27, #1
+# CHECK-NEXT: 6.     1     3.0    0.0    0.0       ldr	d1, [x27], #254
+# CHECK-NEXT: 7.     1     4.0    0.0    4.0       add	x0, x27, #1
+# CHECK-NEXT: 8.     1     3.0    0.0    0.0       ldr	q1, [x27], #254
+# CHECK-NEXT: 9.     1     4.0    0.0    4.0       add	x0, x27, #1
+# CHECK-NEXT:        1     2.7    0.1    2.0       <total>
 
 # CHECK:      [47] Code Region - G48
 

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I must have missed this one as it is WriteI not WriteAdr

@FLZ101
Copy link
Contributor Author

FLZ101 commented Aug 29, 2024

@FLZ101
Copy link
Contributor Author

FLZ101 commented Sep 11, 2024

@davemgreen Can this PR be merged?

@davemgreen
Copy link
Collaborator

Can you rebase? I updated some cyc_'s to c_' to try and keep them consistent between models. Sorry about the conflict.

* fix write order of "Load vector reg, immed post-index"
* fix a typo
@FLZ101
Copy link
Contributor Author

FLZ101 commented Sep 17, 2024

Can you rebase? I updated some cyc_'s to c_' to try and keep them consistent between models. Sorry about the conflict.

The rebase is done.

@davemgreen
Copy link
Collaborator

Thanks.

@davemgreen davemgreen merged commit ef34cba into llvm:main Sep 18, 2024
8 checks passed
tmsri pushed a commit to tmsri/llvm-project that referenced this pull request Sep 19, 2024
* fix write order of "Load vector reg, immed post-index"
* fix a typo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants