This repository was archived by the owner on Mar 28, 2020. It is now read-only.

Commit 0f51d6d

[llvm-mca] Add tests for XOP and AVX512 instructions that implicitly clear the upper portion of a super-register.

When the destination register of an XOP instruction is an XMM register, bits [255:128] of the corresponding YMM register are cleared. When the destination register of an EVEX-encoded instruction is an XMM/YMM register, the upper bits of the corresponding ZMM register are cleared. On processors that feature AVX512, a write to an XMM register always clears the upper portion of the corresponding ZMM register if the instruction is VEX or EVEX encoded.

These new tests show some interesting cases which aren't correctly analyzed by llvm-mca. The lack of knowledge related to the implicit update on the super-registers is addressed by D48225.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334945 91177308-0d34-0410-b5e6-96231b3b80d8
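The register semantics described in the commit message can be sketched with a small model. This is an illustration only, not llvm-mca code: the function name `write_vector_reg`, the encoding labels, and the use of a 512-bit container for every register (including the XOP case, where the message only speaks of bits [255:128]) are assumptions made for the example. The legacy (non-VEX) SSE branch is included for contrast, since such writes preserve the upper bits.

```python
# Illustrative model only (hypothetical helper, not part of llvm-mca).
# A vector register is modelled as a Python int holding up to 512 bits.
XMM_BITS, YMM_BITS, ZMM_BITS = 128, 256, 512


def mask(bits):
    """Return an integer with the low `bits` bits set."""
    return (1 << bits) - 1


def write_vector_reg(old_value, value, dest_bits, encoding):
    """Return the full register contents after writing `value` to the low
    `dest_bits` bits of a register that previously held `old_value`.

    encoding is one of "legacy", "vex", "xop", "evex" (labels assumed here).
    """
    if dest_bits not in (XMM_BITS, YMM_BITS, ZMM_BITS):
        raise ValueError("unsupported destination width")
    value &= mask(dest_bits)
    if encoding == "legacy":
        # Legacy SSE writes leave the upper bits of the super-register intact,
        # so the result still depends on the old contents.
        return (old_value & mask(ZMM_BITS) & ~mask(dest_bits)) | value
    if encoding in ("vex", "xop", "evex"):
        # The upper portion is implicitly cleared, so the result does not
        # depend on the previous contents of the super-register.
        return value
    raise ValueError("unknown encoding")


old = mask(ZMM_BITS)  # every bit set before the write
after_vex = write_vector_reg(old, 0x1234, XMM_BITS, "vex")
after_legacy = write_vector_reg(old, 0x1234, XMM_BITS, "legacy")
print(hex(after_vex), after_legacy >> XMM_BITS == mask(ZMM_BITS - XMM_BITS))
```

Because the cleared result no longer depends on the old register contents, a later read of the wider register only needs the narrow write, which is the property the new tests exercise.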
1 parent 6e8480a commit 0f51d6d

5 files changed: 430 additions, 0 deletions

Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -timeline -timeline-max-iterations=2 < %s | FileCheck %s
+
+vmulps %zmm0, %zmm1, %zmm2
+vaddps %xmm1, %xmm1, %xmm2
+vmulps %ymm2, %ymm3, %ymm4
+vaddps %xmm4, %xmm5, %xmm6
+vmulps %xmm6, %xmm3, %xmm4
+vaddps %xmm4, %xmm5, %xmm0
+
+# CHECK: Iterations: 100
+# CHECK-NEXT: Instructions: 600
+# CHECK-NEXT: Total Cycles: 2103
+# CHECK-NEXT: Dispatch Width: 4
+# CHECK-NEXT: IPC: 0.29
+# CHECK-NEXT: Block RThroughput: 3.0
+
+# CHECK: Instruction Info:
+# CHECK-NEXT: [1]: #uOps
+# CHECK-NEXT: [2]: Latency
+# CHECK-NEXT: [3]: RThroughput
+# CHECK-NEXT: [4]: MayLoad
+# CHECK-NEXT: [5]: MayStore
+# CHECK-NEXT: [6]: HasSideEffects
+
+# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
+# CHECK-NEXT: 1 5 1.00 vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: 1 3 1.00 vaddps %xmm1, %xmm1, %xmm2
+# CHECK-NEXT: 1 5 1.00 vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: 1 3 1.00 vaddps %xmm4, %xmm5, %xmm6
+# CHECK-NEXT: 1 5 1.00 vmulps %xmm6, %xmm3, %xmm4
+# CHECK-NEXT: 1 3 1.00 vaddps %xmm4, %xmm5, %xmm0
+
+# CHECK: Resources:
+# CHECK-NEXT: [0] - SBDivider
+# CHECK-NEXT: [1] - SBFPDivider
+# CHECK-NEXT: [2] - SBPort0
+# CHECK-NEXT: [3] - SBPort1
+# CHECK-NEXT: [4] - SBPort4
+# CHECK-NEXT: [5] - SBPort5
+# CHECK-NEXT: [6.0] - SBPort23
+# CHECK-NEXT: [6.1] - SBPort23
+
+# CHECK: Resource pressure per iteration:
+# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]
+# CHECK-NEXT: - - 3.00 3.00 - - - -
+
+# CHECK: Resource pressure by instruction:
+# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:
+# CHECK-NEXT: - - 1.00 - - - - - vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: - - - 1.00 - - - - vaddps %xmm1, %xmm1, %xmm2
+# CHECK-NEXT: - - 1.00 - - - - - vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: - - - 1.00 - - - - vaddps %xmm4, %xmm5, %xmm6
+# CHECK-NEXT: - - 1.00 - - - - - vmulps %xmm6, %xmm3, %xmm4
+# CHECK-NEXT: - - - 1.00 - - - - vaddps %xmm4, %xmm5, %xmm0
+
+# CHECK: Timeline view:
+# CHECK-NEXT: 0123456789 0123456789
+# CHECK-NEXT: Index 0123456789 0123456789 01234
+
+# CHECK: [0,0] DeeeeeER . . . . . . . . vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: [0,1] DeeeE--R . . . . . . . . vaddps %xmm1, %xmm1, %xmm2
+# CHECK-NEXT: [0,2] D=====eeeeeER . . . . . . . vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [0,3] D==========eeeER . . . . . . vaddps %xmm4, %xmm5, %xmm6
+# CHECK-NEXT: [0,4] .D============eeeeeER . . . . . vmulps %xmm6, %xmm3, %xmm4
+# CHECK-NEXT: [0,5] .D=================eeeER . . . . . vaddps %xmm4, %xmm5, %xmm0
+# CHECK-NEXT: [1,0] .D====================eeeeeER . . . . vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: [1,1] .DeeeE----------------------R . . . . vaddps %xmm1, %xmm1, %xmm2
+# CHECK-NEXT: [1,2] . D========================eeeeeER . . . vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [1,3] . D=============================eeeER . . vaddps %xmm4, %xmm5, %xmm6
+# CHECK-NEXT: [1,4] . D================================eeeeeER . vmulps %xmm6, %xmm3, %xmm4
+# CHECK-NEXT: [1,5] . D=====================================eeeER vaddps %xmm4, %xmm5, %xmm0
+
+# CHECK: Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK: [0] [1] [2] [3]
+# CHECK-NEXT: 0. 2 11.0 0.5 0.0 vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: 1. 2 1.0 1.0 12.0 vaddps %xmm1, %xmm1, %xmm2
+# CHECK-NEXT: 2. 2 15.5 0.0 0.0 vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: 3. 2 20.5 0.0 0.0 vaddps %xmm4, %xmm5, %xmm6
+# CHECK-NEXT: 4. 2 23.0 0.0 0.0 vmulps %xmm6, %xmm3, %xmm4
+# CHECK-NEXT: 5. 2 28.0 0.0 0.0 vaddps %xmm4, %xmm5, %xmm0
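The summary figures in this expected output can be cross-checked with a little arithmetic. The sketch below is only a reading aid: the variable names are made up, and the choice of which instructions sit on the serialized chain is read off the timeline (instruction 1 issues immediately, while instructions 0, 2, 3, 4 and 5 each wait for the previous one in that list). The Block RThroughput line follows the definition in the llvm-mca command guide, namely the larger of uOps over dispatch width and the highest per-resource pressure.

```python
# Values copied from the CHECK lines; names are illustrative only.
iterations = 100
instructions_per_iteration = 6
total_cycles = 2103
dispatch_width = 4
uops_per_iteration = 6  # each instruction is reported as 1 uOp

instructions = iterations * instructions_per_iteration
ipc = instructions / total_cycles

# Latencies of the instructions that execute back to back in the timeline:
# vmulps (zmm), vmulps (ymm), vaddps, vmulps, vaddps.
chain_latencies = [5, 5, 3, 5, 3]
cycles_per_iteration = sum(chain_latencies)
leftover_cycles = total_cycles - iterations * cycles_per_iteration

# Resource pressure per iteration on the two busy ports.
port_pressure = {"SBPort0": 3.00, "SBPort1": 3.00}
block_rthroughput = max(uops_per_iteration / dispatch_width,
                        max(port_pressure.values()))

print(f"IPC: {ipc:.2f}")
print(f"cycles per iteration along the chain: {cycles_per_iteration}")
print(f"cycles not covered by the chain: {leftover_cycles}")
print(f"Block RThroughput: {block_rthroughput:.1f}")
```

The gap between 21 cycles per iteration along the chain and a resource bound of 3.0 is what makes this output a useful baseline for the dependency-tracking issue the commit message mentions.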
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -timeline -timeline-max-iterations=2 < %s | FileCheck %s
+
+vmulps %zmm0, %zmm1, %zmm2
+vaddps %ymm1, %ymm1, %ymm2
+vmulps %zmm2, %zmm3, %zmm4
+vaddps %xmm4, %xmm5, %xmm6
+vmulps %xmm6, %xmm3, %xmm4
+vaddps %xmm4, %xmm5, %xmm0
+
+# CHECK: Iterations: 100
+# CHECK-NEXT: Instructions: 600
+# CHECK-NEXT: Total Cycles: 2103
+# CHECK-NEXT: Dispatch Width: 4
+# CHECK-NEXT: IPC: 0.29
+# CHECK-NEXT: Block RThroughput: 3.0
+
+# CHECK: Instruction Info:
+# CHECK-NEXT: [1]: #uOps
+# CHECK-NEXT: [2]: Latency
+# CHECK-NEXT: [3]: RThroughput
+# CHECK-NEXT: [4]: MayLoad
+# CHECK-NEXT: [5]: MayStore
+# CHECK-NEXT: [6]: HasSideEffects
+
+# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
+# CHECK-NEXT: 1 5 1.00 vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: 1 3 1.00 vaddps %ymm1, %ymm1, %ymm2
+# CHECK-NEXT: 1 5 1.00 vmulps %zmm2, %zmm3, %zmm4
+# CHECK-NEXT: 1 3 1.00 vaddps %xmm4, %xmm5, %xmm6
+# CHECK-NEXT: 1 5 1.00 vmulps %xmm6, %xmm3, %xmm4
+# CHECK-NEXT: 1 3 1.00 vaddps %xmm4, %xmm5, %xmm0
+
+# CHECK: Resources:
+# CHECK-NEXT: [0] - SBDivider
+# CHECK-NEXT: [1] - SBFPDivider
+# CHECK-NEXT: [2] - SBPort0
+# CHECK-NEXT: [3] - SBPort1
+# CHECK-NEXT: [4] - SBPort4
+# CHECK-NEXT: [5] - SBPort5
+# CHECK-NEXT: [6.0] - SBPort23
+# CHECK-NEXT: [6.1] - SBPort23
+
+# CHECK: Resource pressure per iteration:
+# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]
+# CHECK-NEXT: - - 3.00 3.00 - - - -
+
+# CHECK: Resource pressure by instruction:
+# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:
+# CHECK-NEXT: - - 1.00 - - - - - vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: - - - 1.00 - - - - vaddps %ymm1, %ymm1, %ymm2
+# CHECK-NEXT: - - 1.00 - - - - - vmulps %zmm2, %zmm3, %zmm4
+# CHECK-NEXT: - - - 1.00 - - - - vaddps %xmm4, %xmm5, %xmm6
+# CHECK-NEXT: - - 1.00 - - - - - vmulps %xmm6, %xmm3, %xmm4
+# CHECK-NEXT: - - - 1.00 - - - - vaddps %xmm4, %xmm5, %xmm0
+
+# CHECK: Timeline view:
+# CHECK-NEXT: 0123456789 0123456789
+# CHECK-NEXT: Index 0123456789 0123456789 01234
+
+# CHECK: [0,0] DeeeeeER . . . . . . . . vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: [0,1] DeeeE--R . . . . . . . . vaddps %ymm1, %ymm1, %ymm2
+# CHECK-NEXT: [0,2] D=====eeeeeER . . . . . . . vmulps %zmm2, %zmm3, %zmm4
+# CHECK-NEXT: [0,3] D==========eeeER . . . . . . vaddps %xmm4, %xmm5, %xmm6
+# CHECK-NEXT: [0,4] .D============eeeeeER . . . . . vmulps %xmm6, %xmm3, %xmm4
+# CHECK-NEXT: [0,5] .D=================eeeER . . . . . vaddps %xmm4, %xmm5, %xmm0
+# CHECK-NEXT: [1,0] .D====================eeeeeER . . . . vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: [1,1] .DeeeE----------------------R . . . . vaddps %ymm1, %ymm1, %ymm2
+# CHECK-NEXT: [1,2] . D========================eeeeeER . . . vmulps %zmm2, %zmm3, %zmm4
+# CHECK-NEXT: [1,3] . D=============================eeeER . . vaddps %xmm4, %xmm5, %xmm6
+# CHECK-NEXT: [1,4] . D================================eeeeeER . vmulps %xmm6, %xmm3, %xmm4
+# CHECK-NEXT: [1,5] . D=====================================eeeER vaddps %xmm4, %xmm5, %xmm0
+
+# CHECK: Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK: [0] [1] [2] [3]
+# CHECK-NEXT: 0. 2 11.0 0.5 0.0 vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: 1. 2 1.0 1.0 12.0 vaddps %ymm1, %ymm1, %ymm2
+# CHECK-NEXT: 2. 2 15.5 0.0 0.0 vmulps %zmm2, %zmm3, %zmm4
+# CHECK-NEXT: 3. 2 20.5 0.0 0.0 vaddps %xmm4, %xmm5, %xmm6
+# CHECK-NEXT: 4. 2 23.0 0.0 0.0 vmulps %xmm6, %xmm3, %xmm4
+# CHECK-NEXT: 5. 2 28.0 0.0 0.0 vaddps %xmm4, %xmm5, %xmm0
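The "Average Wait times" table can be related back to the timeline rows with a short script. This is a hedged sketch rather than llvm-mca's own logic: `parse_row` and `average_wait` are hypothetical helpers, and the rule that the queue wait equals the number of `=` characters plus one is an inference from the numbers in this output (rows with no `=` at all are reported with a wait of 1.0). The two sample rows are built from the `=` run lengths of instruction 2 in the timeline (5 and 24).

```python
import re

# Matches one timeline row: "[iter,idx]", optional leading dots/spaces,
# then D (dispatch), '=' (waiting), 'e' (executing), E (executed),
# '-' (waiting to retire) and R (retired).
TIMELINE_ROW = re.compile(r"\[(\d+),(\d+)\]\s+[.\s]*D(=*)(e+)E(-*)R")


def parse_row(line):
    """Return (iteration, index, queue_wait, retire_wait) or None."""
    match = TIMELINE_ROW.search(line)
    if match is None:
        return None
    iteration, index = int(match.group(1)), int(match.group(2))
    queue_wait = len(match.group(3)) + 1  # assumed convention, see lead-in
    retire_wait = len(match.group(5))
    return iteration, index, queue_wait, retire_wait


def average_wait(lines):
    """Average queue wait over all rows that parse successfully."""
    waits = [row[2] for row in map(parse_row, lines) if row is not None]
    return sum(waits) / len(waits)


# Rows for instruction 2, rebuilt from the run lengths in the timeline.
rows = [
    "[0,2] D" + "=" * 5 + "eeeeeER",
    "[1,2] . D" + "=" * 24 + "eeeeeER",
]
print(average_wait(rows))
```

Under that assumed convention the two rows average to the 15.5 listed for instruction 2 in column [1].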
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -timeline -timeline-max-iterations=2 < %s | FileCheck %s
+
+vmulps %zmm0, %zmm1, %zmm2
+vaddps %xmm16, %xmm17, %xmm2
+vmulps %ymm2, %ymm3, %ymm4
+vaddps %xmm4, %xmm18, %xmm6
+vmulps %xmm6, %xmm19, %xmm4
+vaddps %xmm4, %xmm20, %xmm0
+
+# CHECK: Iterations: 100
+# CHECK-NEXT: Instructions: 600
+# CHECK-NEXT: Total Cycles: 2103
+# CHECK-NEXT: Dispatch Width: 4
+# CHECK-NEXT: IPC: 0.29
+# CHECK-NEXT: Block RThroughput: 3.0
+
+# CHECK: Instruction Info:
+# CHECK-NEXT: [1]: #uOps
+# CHECK-NEXT: [2]: Latency
+# CHECK-NEXT: [3]: RThroughput
+# CHECK-NEXT: [4]: MayLoad
+# CHECK-NEXT: [5]: MayStore
+# CHECK-NEXT: [6]: HasSideEffects
+
+# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
+# CHECK-NEXT: 1 5 1.00 vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: 1 3 1.00 vaddps %xmm16, %xmm17, %xmm2
+# CHECK-NEXT: 1 5 1.00 vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: 1 3 1.00 vaddps %xmm4, %xmm18, %xmm6
+# CHECK-NEXT: 1 5 1.00 vmulps %xmm6, %xmm19, %xmm4
+# CHECK-NEXT: 1 3 1.00 vaddps %xmm4, %xmm20, %xmm0
+
+# CHECK: Resources:
+# CHECK-NEXT: [0] - SBDivider
+# CHECK-NEXT: [1] - SBFPDivider
+# CHECK-NEXT: [2] - SBPort0
+# CHECK-NEXT: [3] - SBPort1
+# CHECK-NEXT: [4] - SBPort4
+# CHECK-NEXT: [5] - SBPort5
+# CHECK-NEXT: [6.0] - SBPort23
+# CHECK-NEXT: [6.1] - SBPort23
+
+# CHECK: Resource pressure per iteration:
+# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]
+# CHECK-NEXT: - - 3.00 3.00 - - - -
+
+# CHECK: Resource pressure by instruction:
+# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:
+# CHECK-NEXT: - - 1.00 - - - - - vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: - - - 1.00 - - - - vaddps %xmm16, %xmm17, %xmm2
+# CHECK-NEXT: - - 1.00 - - - - - vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: - - - 1.00 - - - - vaddps %xmm4, %xmm18, %xmm6
+# CHECK-NEXT: - - 1.00 - - - - - vmulps %xmm6, %xmm19, %xmm4
+# CHECK-NEXT: - - - 1.00 - - - - vaddps %xmm4, %xmm20, %xmm0
+
+# CHECK: Timeline view:
+# CHECK-NEXT: 0123456789 0123456789
+# CHECK-NEXT: Index 0123456789 0123456789 01234
+
+# CHECK: [0,0] DeeeeeER . . . . . . . . vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: [0,1] DeeeE--R . . . . . . . . vaddps %xmm16, %xmm17, %xmm2
+# CHECK-NEXT: [0,2] D=====eeeeeER . . . . . . . vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [0,3] D==========eeeER . . . . . . vaddps %xmm4, %xmm18, %xmm6
+# CHECK-NEXT: [0,4] .D============eeeeeER . . . . . vmulps %xmm6, %xmm19, %xmm4
+# CHECK-NEXT: [0,5] .D=================eeeER . . . . . vaddps %xmm4, %xmm20, %xmm0
+# CHECK-NEXT: [1,0] .D====================eeeeeER . . . . vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: [1,1] .DeeeE----------------------R . . . . vaddps %xmm16, %xmm17, %xmm2
+# CHECK-NEXT: [1,2] . D========================eeeeeER . . . vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [1,3] . D=============================eeeER . . vaddps %xmm4, %xmm18, %xmm6
+# CHECK-NEXT: [1,4] . D================================eeeeeER . vmulps %xmm6, %xmm19, %xmm4
+# CHECK-NEXT: [1,5] . D=====================================eeeER vaddps %xmm4, %xmm20, %xmm0
+
+# CHECK: Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK: [0] [1] [2] [3]
+# CHECK-NEXT: 0. 2 11.0 0.5 0.0 vmulps %zmm0, %zmm1, %zmm2
+# CHECK-NEXT: 1. 2 1.0 1.0 12.0 vaddps %xmm16, %xmm17, %xmm2
+# CHECK-NEXT: 2. 2 15.5 0.0 0.0 vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: 3. 2 20.5 0.0 0.0 vaddps %xmm4, %xmm18, %xmm6
+# CHECK-NEXT: 4. 2 23.0 0.0 0.0 vmulps %xmm6, %xmm19, %xmm4
+# CHECK-NEXT: 5. 2 28.0 0.0 0.0 vaddps %xmm4, %xmm20, %xmm0
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -timeline -timeline-max-iterations=2 < %s | FileCheck %s
+
+vmulps %ymm0, %ymm1, %ymm2
+vfrczpd %xmm1, %xmm2
+vmulps %ymm2, %ymm3, %ymm4
+vaddps %ymm4, %ymm5, %ymm6
+vmulps %ymm6, %ymm3, %ymm4
+vaddps %ymm4, %ymm5, %ymm0
+
+# CHECK: Iterations: 100
+# CHECK-NEXT: Instructions: 600
+# CHECK-NEXT: Total Cycles: 2103
+# CHECK-NEXT: Dispatch Width: 4
+# CHECK-NEXT: IPC: 0.29
+# CHECK-NEXT: Block RThroughput: 3.0
+
+# CHECK: Instruction Info:
+# CHECK-NEXT: [1]: #uOps
+# CHECK-NEXT: [2]: Latency
+# CHECK-NEXT: [3]: RThroughput
+# CHECK-NEXT: [4]: MayLoad
+# CHECK-NEXT: [5]: MayStore
+# CHECK-NEXT: [6]: HasSideEffects
+
+# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
+# CHECK-NEXT: 1 5 1.00 vmulps %ymm0, %ymm1, %ymm2
+# CHECK-NEXT: 1 3 1.00 vfrczpd %xmm1, %xmm2
+# CHECK-NEXT: 1 5 1.00 vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: 1 3 1.00 vaddps %ymm4, %ymm5, %ymm6
+# CHECK-NEXT: 1 5 1.00 vmulps %ymm6, %ymm3, %ymm4
+# CHECK-NEXT: 1 3 1.00 vaddps %ymm4, %ymm5, %ymm0
+
+# CHECK: Resources:
+# CHECK-NEXT: [0] - SBDivider
+# CHECK-NEXT: [1] - SBFPDivider
+# CHECK-NEXT: [2] - SBPort0
+# CHECK-NEXT: [3] - SBPort1
+# CHECK-NEXT: [4] - SBPort4
+# CHECK-NEXT: [5] - SBPort5
+# CHECK-NEXT: [6.0] - SBPort23
+# CHECK-NEXT: [6.1] - SBPort23
+
+# CHECK: Resource pressure per iteration:
+# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]
+# CHECK-NEXT: - - 3.00 3.00 - - - -
+
+# CHECK: Resource pressure by instruction:
+# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:
+# CHECK-NEXT: - - 1.00 - - - - - vmulps %ymm0, %ymm1, %ymm2
+# CHECK-NEXT: - - - 1.00 - - - - vfrczpd %xmm1, %xmm2
+# CHECK-NEXT: - - 1.00 - - - - - vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: - - - 1.00 - - - - vaddps %ymm4, %ymm5, %ymm6
+# CHECK-NEXT: - - 1.00 - - - - - vmulps %ymm6, %ymm3, %ymm4
+# CHECK-NEXT: - - - 1.00 - - - - vaddps %ymm4, %ymm5, %ymm0
+
+# CHECK: Timeline view:
+# CHECK-NEXT: 0123456789 0123456789
+# CHECK-NEXT: Index 0123456789 0123456789 01234
+
+# CHECK: [0,0] DeeeeeER . . . . . . . . vmulps %ymm0, %ymm1, %ymm2
+# CHECK-NEXT: [0,1] DeeeE--R . . . . . . . . vfrczpd %xmm1, %xmm2
+# CHECK-NEXT: [0,2] D=====eeeeeER . . . . . . . vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [0,3] D==========eeeER . . . . . . vaddps %ymm4, %ymm5, %ymm6
+# CHECK-NEXT: [0,4] .D============eeeeeER . . . . . vmulps %ymm6, %ymm3, %ymm4
+# CHECK-NEXT: [0,5] .D=================eeeER . . . . . vaddps %ymm4, %ymm5, %ymm0
+# CHECK-NEXT: [1,0] .D====================eeeeeER . . . . vmulps %ymm0, %ymm1, %ymm2
+# CHECK-NEXT: [1,1] .DeeeE----------------------R . . . . vfrczpd %xmm1, %xmm2
+# CHECK-NEXT: [1,2] . D========================eeeeeER . . . vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: [1,3] . D=============================eeeER . . vaddps %ymm4, %ymm5, %ymm6
+# CHECK-NEXT: [1,4] . D================================eeeeeER . vmulps %ymm6, %ymm3, %ymm4
+# CHECK-NEXT: [1,5] . D=====================================eeeER vaddps %ymm4, %ymm5, %ymm0
+
+# CHECK: Average Wait times (based on the timeline view):
+# CHECK-NEXT: [0]: Executions
+# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
+# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
+# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
+
+# CHECK: [0] [1] [2] [3]
+# CHECK-NEXT: 0. 2 11.0 0.5 0.0 vmulps %ymm0, %ymm1, %ymm2
+# CHECK-NEXT: 1. 2 1.0 1.0 12.0 vfrczpd %xmm1, %xmm2
+# CHECK-NEXT: 2. 2 15.5 0.0 0.0 vmulps %ymm2, %ymm3, %ymm4
+# CHECK-NEXT: 3. 2 20.5 0.0 0.0 vaddps %ymm4, %ymm5, %ymm6
+# CHECK-NEXT: 4. 2 23.0 0.0 0.0 vmulps %ymm6, %ymm3, %ymm4
+# CHECK-NEXT: 5. 2 28.0 0.0 0.0 vaddps %ymm4, %ymm5, %ymm0
