
Commit 6193233

yugr and Yury Gribov authored
[AArch64] Fix sched model for TSV110 core. (#82343)
The accumulator operand of a MADD instruction can be bypassed from another MUL-like operation. Before this fix, bypassing was incorrectly applied to the multiplier operand.

Co-authored-by: Yury Gribov <[email protected]>
1 parent bcbffd9 commit 6193233

2 files changed: 86 additions, 3 deletions


llvm/lib/Target/AArch64/AArch64SchedTSV110.td

Lines changed: 3 additions & 3 deletions
@@ -419,10 +419,10 @@ def : InstRW<[TSV110Wr_12cyc_1MDU], (instregex "^(S|U)DIVWr$")>;
 def : InstRW<[TSV110Wr_20cyc_1MDU], (instregex "^(S|U)DIVXr$")>;
 
 def TSV110ReadMAW : SchedReadAdvance<2, [TSV110Wr_3cyc_1MDU]>;
-def : InstRW<[TSV110Wr_3cyc_1MDU, TSV110ReadMAW], (instrs MADDWrrr, MSUBWrrr)>;
+def : InstRW<[TSV110Wr_3cyc_1MDU, ReadIM, ReadIM, TSV110ReadMAW], (instrs MADDWrrr, MSUBWrrr)>;
 def TSV110ReadMAQ : SchedReadAdvance<3, [TSV110Wr_4cyc_1MDU]>;
-def : InstRW<[TSV110Wr_4cyc_1MDU, TSV110ReadMAQ], (instrs MADDXrrr, MSUBXrrr)>;
-def : InstRW<[TSV110Wr_3cyc_1MDU, TSV110ReadMAW], (instregex "(S|U)(MADDL|MSUBL)rrr")>;
+def : InstRW<[TSV110Wr_4cyc_1MDU, ReadIM, ReadIM, TSV110ReadMAQ], (instrs MADDXrrr, MSUBXrrr)>;
+def : InstRW<[TSV110Wr_3cyc_1MDU, ReadIM, ReadIM, TSV110ReadMAW], (instregex "(S|U)(MADDL|MSUBL)rrr")>;
 def : InstRW<[TSV110Wr_4cyc_1MDU], (instregex "^(S|U)MULHrr$")>;
 
 
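
Reviewer note: the hunk above only makes sense if the entries of an `InstRW` list are matched to operands by position, so here is a minimal Python sketch of that idea. It is an illustration under assumptions, not how TableGen is implemented: it assumes the first entry is the `SchedWrite` for the result, that the remaining entries are `SchedRead`s in source-operand order, and that unlisted operands fall back to a default read. `assign_reads`, `MADD_USES`, and the `(default read)` placeholder are hypothetical names introduced for this note.

```python
# Illustrative model only; the real assignment is done by TableGen, not by this code.
MADD_USES = ["Rn", "Rm", "Ra"]  # source operands of `madd Rd, Rn, Rm, Ra`, in order

def assign_reads(sched_list, uses, default="(default read)"):
    """Pair each source operand with a SchedRead, positionally.

    Assumes entry 0 is the SchedWrite for the single result and that the
    remaining entries are SchedReads listed in source-operand order; operands
    without an explicit entry get a placeholder default.
    """
    reads = list(sched_list[1:])
    reads += [default] * (len(uses) - len(reads))
    return dict(zip(uses, reads))

# Entry lists copied from the removed and added MADDXrrr/MSUBXrrr lines.
before = assign_reads(["TSV110Wr_4cyc_1MDU", "TSV110ReadMAQ"], MADD_USES)
after = assign_reads(
    ["TSV110Wr_4cyc_1MDU", "ReadIM", "ReadIM", "TSV110ReadMAQ"], MADD_USES
)

print(before)  # the advance lands on Rn, a multiplication operand
print(after)   # the advance lands on Ra, the accumulator
```

Under these assumptions, the two `ReadIM` entries act as positional padding so that the `SchedReadAdvance` reaches the third source operand.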

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=aarch64 -mcpu=tsv110 --instruction-info=0 --resource-pressure=0 --timeline --iterations=1 < %s | FileCheck %s
+
+# LLVM-MCA-BEGIN madd nobypass
+mul x0, x1, x2
+add x0, x0, x1
+add x0, x0, x1
+add x0, x0, x1
+# LLVM-MCA-END
+
+# LLVM-MCA-BEGIN madd bypass
+mul x0, x1, x2
+madd x0, x1, x2, x0
+madd x0, x1, x2, x0
+madd x0, x0, x0, x0
+# LLVM-MCA-END
+
+# CHECK: [0] Code Region - madd nobypass
+
+# CHECK: Iterations: 1
+# CHECK-NEXT: Instructions: 4
+# CHECK-NEXT: Total Cycles: 10
+# CHECK-NEXT: Total uOps: 4
+
+# CHECK: Dispatch Width: 4
+# CHECK-NEXT: uOps Per Cycle: 0.40
+# CHECK-NEXT: IPC: 0.40
+# CHECK-NEXT: Block RThroughput: 1.0
+
+# CHECK: Timeline view:
+# CHECK-NEXT: Index 0123456789
+
+# CHECK: [0,0] DeeeeER . mul x0, x1, x2
+# CHECK-NEXT: [0,1] D====eER . add x0, x0, x1
+# CHECK-NEXT: [0,2] D=====eER. add x0, x0, x1
+# CHECK-NEXT: [0,3] D======eER add x0, x0, x1
+
+# CHECK: Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK: [0] [1] [2] [3]
+# CHECK-NEXT: 0. 1 1.0 1.0 0.0 mul x0, x1, x2
+# CHECK-NEXT: 1. 1 5.0 0.0 0.0 add x0, x0, x1
+# CHECK-NEXT: 2. 1 6.0 0.0 0.0 add x0, x0, x1
+# CHECK-NEXT: 3. 1 7.0 0.0 0.0 add x0, x0, x1
+# CHECK-NEXT: 1 4.8 0.3 0.0 <total>
+
+# CHECK: [1] Code Region - madd bypass
+
+# CHECK: Iterations: 1
+# CHECK-NEXT: Instructions: 4
+# CHECK-NEXT: Total Cycles: 13
+# CHECK-NEXT: Total uOps: 4
+
+# CHECK: Dispatch Width: 4
+# CHECK-NEXT: uOps Per Cycle: 0.31
+# CHECK-NEXT: IPC: 0.31
+# CHECK-NEXT: Block RThroughput: 4.0
+
+# CHECK: Timeline view:
+# CHECK-NEXT: 012
+# CHECK-NEXT: Index 0123456789
+
+# CHECK: [0,0] DeeeeER . . mul x0, x1, x2
+# CHECK-NEXT: [0,1] D=eeeeER . . madd x0, x1, x2, x0
+# CHECK-NEXT: [0,2] D==eeeeER . . madd x0, x1, x2, x0
+# CHECK-NEXT: [0,3] D======eeeeER madd x0, x0, x0, x0
+
+# CHECK: Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK: [0] [1] [2] [3]
+# CHECK-NEXT: 0. 1 1.0 1.0 0.0 mul x0, x1, x2
+# CHECK-NEXT: 1. 1 2.0 0.0 0.0 madd x0, x1, x2, x0
+# CHECK-NEXT: 2. 1 3.0 0.0 0.0 madd x0, x1, x2, x0
+# CHECK-NEXT: 3. 1 7.0 0.0 0.0 madd x0, x0, x0, x0
+# CHECK-NEXT: 1 3.3 0.3 0.0 <total>
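
Reviewer note: the issue cycles in the two timelines can be reproduced with a very small dependency model, sketched below in Python. This is a deliberately simplified, hypothetical model and not llvm-mca's algorithm: it tracks register dependencies only and ignores dispatch width, resource contention, and retirement. It assumes a 4-cycle latency for `mul`/`madd` (from `TSV110Wr_4cyc_1MDU`), a 1-cycle latency for `add` (read off the timeline), a 3-cycle advance on the accumulator (from `TSV110ReadMAQ`), and that every instruction can issue no earlier than cycle 1. `issue_cycles`, `LATENCY`, and `ACC_ADVANCE` are names made up for this note.

```python
# Simplified, hypothetical model: data dependencies only, no resource limits.
LATENCY = {"mul": 4, "madd": 4, "add": 1}  # cycles, as assumed in the note above
ACC_ADVANCE = 3                            # from SchedReadAdvance<3, [TSV110Wr_4cyc_1MDU]>
MDU_PRODUCERS = ("mul", "madd")            # opcodes assumed to use TSV110Wr_4cyc_1MDU

def issue_cycles(program):
    """Return the first execution cycle of each instruction.

    program: list of (opcode, dest, sources). For madd, the last source is
    the accumulator, the only operand that receives the read advance.
    """
    producers = {}  # register -> (issue cycle, opcode) of its latest writer
    issued = []
    for opcode, dest, sources in program:
        start = 1  # dispatched in cycle 0, so cycle 1 is the earliest issue
        for pos, reg in enumerate(sources):
            if reg not in producers:
                continue  # value is available from the start
            prod_issue, prod_op = producers[reg]
            latency = LATENCY[prod_op]
            is_accumulator = opcode == "madd" and pos == len(sources) - 1
            if is_accumulator and prod_op in MDU_PRODUCERS:
                latency = max(0, latency - ACC_ADVANCE)
            start = max(start, prod_issue + latency)
        producers[dest] = (start, opcode)
        issued.append(start)
    return issued

nobypass = [
    ("mul", "x0", ["x1", "x2"]),
    ("add", "x0", ["x0", "x1"]),
    ("add", "x0", ["x0", "x1"]),
    ("add", "x0", ["x0", "x1"]),
]
bypass = [
    ("mul", "x0", ["x1", "x2"]),
    ("madd", "x0", ["x1", "x2", "x0"]),
    ("madd", "x0", ["x1", "x2", "x0"]),
    ("madd", "x0", ["x0", "x0", "x0"]),
]

print(issue_cycles(nobypass))  # [1, 5, 6, 7]
print(issue_cycles(bypass))    # [1, 2, 3, 7]
```

The results line up with the column of the first `e` in each timeline row. The last `madd` waits the full 4 cycles because `x0` also feeds its multiplication operands, which get no advance after this fix.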
