[AArch64] Add flag setting instructions to scheduling model. #96880
@llvm/pr-subscribers-backend-aarch64
Author: Rin Dobrescu (Rin18)
Changes
Some flag setting instructions (such as ANDS, ADDS, CCMN) were missing from the V2 scheduling model. This patch adds them in.
Patch is 20.24 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/96880.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64SchedNeoverseV2.td b/llvm/lib/Target/AArch64/AArch64SchedNeoverseV2.td
index 7fed8fed90017..9f43db10c8d0f 100644
--- a/llvm/lib/Target/AArch64/AArch64SchedNeoverseV2.td
+++ b/llvm/lib/Target/AArch64/AArch64SchedNeoverseV2.td
@@ -1108,12 +1108,19 @@ def : InstRW<[V2Write_1cyc_1B_1R], (instrs BL, BLR)>;
// ALU, basic
// ALU, basic, flagset
def : SchedAlias<WriteI, V2Write_1cyc_1I>;
-def : InstRW<[V2Write_1cyc_1F], (instregex "^(ADC|SBC)S[WX]r$")>;
+def : InstRW<[V2Write_1cyc_1F], (instregex "^(ADD|SUB)S[WX]r[ir]$",
+ "^(ADC|SBC)S[WX]r$",
+ "^ANDS[WX]ri$",
+ "^(AND|BIC)S[WX]rr$")>;
def : InstRW<[V2Write_0or1cyc_1I], (instregex "^MOVZ[WX]i$")>;
// ALU, extend and shift
def : SchedAlias<WriteIEReg, V2Write_2cyc_1M>;
+// Conditional compare
+def : InstRW<[V2Write_1cyc_1F],
+ (instregex "^CCMP(W|X)(i|r)", "^CCMN(W|X)(i|r)")>;
+
// Arithmetic, LSL shift, shift <= 4
// Arithmetic, flagset, LSL shift, shift <= 4
// Arithmetic, LSR/ASR/ROR shift or LSL shift > 4
diff --git a/llvm/test/tools/llvm-mca/AArch64/Neoverse/V2-basic-instructions.s b/llvm/test/tools/llvm-mca/AArch64/Neoverse/V2-basic-instructions.s
index 20a38a55c1be1..32ec8247f3301 100644
--- a/llvm/test/tools/llvm-mca/AArch64/Neoverse/V2-basic-instructions.s
+++ b/llvm/test/tools/llvm-mca/AArch64/Neoverse/V2-basic-instructions.s
@@ -1404,16 +1404,16 @@ drps
# CHECK-NEXT: 1 1 0.17 sub w4, w20, #546, lsl #12
# CHECK-NEXT: 1 1 0.17 sub sp, sp, #288
# CHECK-NEXT: 1 1 0.17 sub wsp, w19, #16
-# CHECK-NEXT: 1 1 0.17 adds w13, w23, #291, lsl #12
-# CHECK-NEXT: 1 1 0.17 cmn w2, #4095
-# CHECK-NEXT: 1 1 0.17 adds w20, wsp, #0
-# CHECK-NEXT: 1 1 0.17 cmn x3, #1, lsl #12
-# CHECK-NEXT: 1 1 0.17 cmp sp, #20, lsl #12
-# CHECK-NEXT: 1 1 0.17 cmp x30, #4095
-# CHECK-NEXT: 1 1 0.17 subs x4, sp, #3822
-# CHECK-NEXT: 1 1 0.17 cmn w3, #291, lsl #12
-# CHECK-NEXT: 1 1 0.17 cmn wsp, #1365
-# CHECK-NEXT: 1 1 0.17 cmn sp, #1092, lsl #12
+# CHECK-NEXT: 1 1 0.25 adds w13, w23, #291, lsl #12
+# CHECK-NEXT: 1 1 0.25 cmn w2, #4095
+# CHECK-NEXT: 1 1 0.25 adds w20, wsp, #0
+# CHECK-NEXT: 1 1 0.25 cmn x3, #1, lsl #12
+# CHECK-NEXT: 1 1 0.25 cmp sp, #20, lsl #12
+# CHECK-NEXT: 1 1 0.25 cmp x30, #4095
+# CHECK-NEXT: 1 1 0.25 subs x4, sp, #3822
+# CHECK-NEXT: 1 1 0.25 cmn w3, #291, lsl #12
+# CHECK-NEXT: 1 1 0.25 cmn wsp, #1365
+# CHECK-NEXT: 1 1 0.25 cmn sp, #1092, lsl #12
# CHECK-NEXT: 1 1 0.17 mov sp, x30
# CHECK-NEXT: 1 1 0.17 mov wsp, w20
# CHECK-NEXT: 1 1 0.17 mov x11, sp
@@ -1699,30 +1699,30 @@ drps
# CHECK-NEXT: 1 1 0.50 b.ne #4
# CHECK-NEXT: 1 1 0.50 b.ge #1048572
# CHECK-NEXT: 1 1 0.50 b.ge #-4
-# CHECK-NEXT: 1 1 0.17 ccmp w1, #31, #0, eq
-# CHECK-NEXT: 1 1 0.17 ccmp w3, #0, #15, hs
-# CHECK-NEXT: 1 1 0.17 ccmp wzr, #15, #13, hs
-# CHECK-NEXT: 1 1 0.17 ccmp x9, #31, #0, le
-# CHECK-NEXT: 1 1 0.17 ccmp x3, #0, #15, gt
-# CHECK-NEXT: 1 1 0.17 ccmp xzr, #5, #7, ne
-# CHECK-NEXT: 1 1 0.17 ccmn w1, #31, #0, eq
-# CHECK-NEXT: 1 1 0.17 ccmn w3, #0, #15, hs
-# CHECK-NEXT: 1 1 0.17 ccmn wzr, #15, #13, hs
-# CHECK-NEXT: 1 1 0.17 ccmn x9, #31, #0, le
-# CHECK-NEXT: 1 1 0.17 ccmn x3, #0, #15, gt
-# CHECK-NEXT: 1 1 0.17 ccmn xzr, #5, #7, ne
-# CHECK-NEXT: 1 1 0.17 ccmp w1, wzr, #0, eq
-# CHECK-NEXT: 1 1 0.17 ccmp w3, w0, #15, hs
-# CHECK-NEXT: 1 1 0.17 ccmp wzr, w15, #13, hs
-# CHECK-NEXT: 1 1 0.17 ccmp x9, xzr, #0, le
-# CHECK-NEXT: 1 1 0.17 ccmp x3, x0, #15, gt
-# CHECK-NEXT: 1 1 0.17 ccmp xzr, x5, #7, ne
-# CHECK-NEXT: 1 1 0.17 ccmn w1, wzr, #0, eq
-# CHECK-NEXT: 1 1 0.17 ccmn w3, w0, #15, hs
-# CHECK-NEXT: 1 1 0.17 ccmn wzr, w15, #13, hs
-# CHECK-NEXT: 1 1 0.17 ccmn x9, xzr, #0, le
-# CHECK-NEXT: 1 1 0.17 ccmn x3, x0, #15, gt
-# CHECK-NEXT: 1 1 0.17 ccmn xzr, x5, #7, ne
+# CHECK-NEXT: 1 1 0.25 ccmp w1, #31, #0, eq
+# CHECK-NEXT: 1 1 0.25 ccmp w3, #0, #15, hs
+# CHECK-NEXT: 1 1 0.25 ccmp wzr, #15, #13, hs
+# CHECK-NEXT: 1 1 0.25 ccmp x9, #31, #0, le
+# CHECK-NEXT: 1 1 0.25 ccmp x3, #0, #15, gt
+# CHECK-NEXT: 1 1 0.25 ccmp xzr, #5, #7, ne
+# CHECK-NEXT: 1 1 0.25 ccmn w1, #31, #0, eq
+# CHECK-NEXT: 1 1 0.25 ccmn w3, #0, #15, hs
+# CHECK-NEXT: 1 1 0.25 ccmn wzr, #15, #13, hs
+# CHECK-NEXT: 1 1 0.25 ccmn x9, #31, #0, le
+# CHECK-NEXT: 1 1 0.25 ccmn x3, #0, #15, gt
+# CHECK-NEXT: 1 1 0.25 ccmn xzr, #5, #7, ne
+# CHECK-NEXT: 1 1 0.25 ccmp w1, wzr, #0, eq
+# CHECK-NEXT: 1 1 0.25 ccmp w3, w0, #15, hs
+# CHECK-NEXT: 1 1 0.25 ccmp wzr, w15, #13, hs
+# CHECK-NEXT: 1 1 0.25 ccmp x9, xzr, #0, le
+# CHECK-NEXT: 1 1 0.25 ccmp x3, x0, #15, gt
+# CHECK-NEXT: 1 1 0.25 ccmp xzr, x5, #7, ne
+# CHECK-NEXT: 1 1 0.25 ccmn w1, wzr, #0, eq
+# CHECK-NEXT: 1 1 0.25 ccmn w3, w0, #15, hs
+# CHECK-NEXT: 1 1 0.25 ccmn wzr, w15, #13, hs
+# CHECK-NEXT: 1 1 0.25 ccmn x9, xzr, #0, le
+# CHECK-NEXT: 1 1 0.25 ccmn x3, x0, #15, gt
+# CHECK-NEXT: 1 1 0.25 ccmn xzr, x5, #7, ne
# CHECK-NEXT: 1 1 0.17 csel w1, w0, w19, ne
# CHECK-NEXT: 1 1 0.17 csel wzr, w5, w9, eq
# CHECK-NEXT: 1 1 0.17 csel w9, wzr, w30, gt
@@ -2585,7 +2585,7 @@ drps
# CHECK: Resource pressure per iteration:
# CHECK-NEXT: [0.0] [0.1] [1.0] [1.1] [2] [3.0] [3.1] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]
-# CHECK-NEXT: 11.00 11.00 33.00 33.00 99.00 165.00 165.00 326.58 181.58 109.58 109.58 91.83 91.83 190.00 146.00 30.00 10.00
+# CHECK-NEXT: 11.00 11.00 33.00 33.00 99.00 165.00 165.00 329.42 184.42 112.42 112.42 86.17 86.17 190.00 146.00 30.00 10.00
# CHECK: Resource pressure by instruction:
# CHECK-NEXT: [0.0] [0.1] [1.0] [1.1] [2] [3.0] [3.1] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] Instructions:
@@ -2604,16 +2604,16 @@ drps
# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - sub w4, w20, #546, lsl #12
# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - sub sp, sp, #288
# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - sub wsp, w19, #16
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - adds w13, w23, #291, lsl #12
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - cmn w2, #4095
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - adds w20, wsp, #0
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - cmn x3, #1, lsl #12
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - cmp sp, #20, lsl #12
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - cmp x30, #4095
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - subs x4, sp, #3822
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - cmn w3, #291, lsl #12
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - cmn wsp, #1365
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - cmn sp, #1092, lsl #12
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - adds w13, w23, #291, lsl #12
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - cmn w2, #4095
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - adds w20, wsp, #0
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - cmn x3, #1, lsl #12
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - cmp sp, #20, lsl #12
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - cmp x30, #4095
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - subs x4, sp, #3822
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - cmn w3, #291, lsl #12
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - cmn wsp, #1365
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - cmn sp, #1092, lsl #12
# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - mov sp, x30
# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - mov wsp, w20
# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - mov x11, sp
@@ -2899,30 +2899,30 @@ drps
# CHECK-NEXT: 0.50 0.50 - - - - - - - - - - - - - - - b.ne #4
# CHECK-NEXT: 0.50 0.50 - - - - - - - - - - - - - - - b.ge #1048572
# CHECK-NEXT: 0.50 0.50 - - - - - - - - - - - - - - - b.ge #-4
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmp w1, #31, #0, eq
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmp w3, #0, #15, hs
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmp wzr, #15, #13, hs
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmp x9, #31, #0, le
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmp x3, #0, #15, gt
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmp xzr, #5, #7, ne
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmn w1, #31, #0, eq
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmn w3, #0, #15, hs
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmn wzr, #15, #13, hs
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmn x9, #31, #0, le
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmn x3, #0, #15, gt
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmn xzr, #5, #7, ne
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmp w1, wzr, #0, eq
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmp w3, w0, #15, hs
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmp wzr, w15, #13, hs
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmp x9, xzr, #0, le
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmp x3, x0, #15, gt
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmp xzr, x5, #7, ne
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmn w1, wzr, #0, eq
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmn w3, w0, #15, hs
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmn wzr, w15, #13, hs
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmn x9, xzr, #0, le
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmn x3, x0, #15, gt
-# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - ccmn xzr, x5, #7, ne
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmp w1, #31, #0, eq
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmp w3, #0, #15, hs
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmp wzr, #15, #13, hs
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmp x9, #31, #0, le
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmp x3, #0, #15, gt
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmp xzr, #5, #7, ne
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmn w1, #31, #0, eq
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmn w3, #0, #15, hs
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmn wzr, #15, #13, hs
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmn x9, #31, #0, le
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmn x3, #0, #15, gt
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmn xzr, #5, #7, ne
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmp w1, wzr, #0, eq
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmp w3, w0, #15, hs
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmp wzr, w15, #13, hs
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmp x9, xzr, #0, le
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmp x3, x0, #15, gt
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmp xzr, x5, #7, ne
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmn w1, wzr, #0, eq
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmn w3, w0, #15, hs
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmn wzr, w15, #13, hs
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmn x9, xzr, #0, le
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmn x3, x0, #15, gt
+# CHECK-NEXT: - - - - - - - 0.25 0.25 0.25 0.25 - - - - - - ccmn xzr, x5, #7, ne
# CHECK-NEXT: - - - - - - - 0.17 0.17 0.17 0.17 0.17 0.17 - - - - csel w1, w0, w19, ne
# CHECK-NEXT: - - - - - - - 0...
[truncated]
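The throughput change from 0.17 to 0.25 in the test expectations can be sanity-checked with a little arithmetic. The sketch below is not part of the patch. It assumes that llvm-mca spreads one cycle of pressure evenly over the units an instruction may issue to (six columns before, four after, as the per-instruction table suggests), and it counts only the 34 rows visible in this truncated excerpt. All variable names are illustrative.

```python
from fractions import Fraction

# Assumption: an instruction's pressure is split evenly across the units
# it may issue to, as the per-instruction rows in the diff suggest.
UNITS_BEFORE = 6  # columns [4]..[9] each showed 0.17
UNITS_AFTER = 4   # columns [4]..[7] each show 0.25

# Rows that change in the visible part of the diff:
# 10 adds/subs/cmp/cmn rows plus 24 ccmp/ccmn rows.
CHANGED = 10 + 24

share_before = Fraction(1, UNITS_BEFORE)
share_after = Fraction(1, UNITS_AFTER)

# Columns [4]..[7] gain the difference between the two shares;
# columns [8] and [9] lose the old share entirely.
gain_per_shared_column = CHANGED * (share_after - share_before)
loss_per_dropped_column = CHANGED * share_before

# Totals copied from the "Resource pressure per iteration" lines.
reported_gain = 329.42 - 326.58
reported_loss = 91.83 - 86.17

print(f"share per unit: {float(share_before):.2f} -> {float(share_after):.2f}")
print(f"expected gain {float(gain_per_shared_column):.2f}, reported {reported_gain:.2f}")
print(f"expected loss {float(loss_per_dropped_column):.2f}, reported {reported_loss:.2f}")
```

Under these assumptions the expected and reported deltas agree to within the two-decimal rounding of the llvm-mca report, which suggests the visible rows account for the change in the per-iteration totals.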
Bit of a nit, but I don't see test changes for ADC, SBC, and BIC?
ADC and SBC had existing tests that were not influenced by this change, as both were already defined. It looks like BIC was also already defined, so I removed it from the patch.
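To see which opcode names the patterns in the posted diff would cover, here is a minimal sketch. It is not LLVM code. It assumes that `instregex` behaves like a prefix match on the opcode name, which Python's `re.match` approximates, and the opcode names in `candidates` are examples chosen for illustration. The patterns are copied from the diff as posted, so BIC still appears even though the reply above says it was later removed.

```python
import re

# Patterns copied from the instregex lists in the posted diff.
FLAGSET_PATTERNS = [
    r"^(ADD|SUB)S[WX]r[ir]$",
    r"^(ADC|SBC)S[WX]r$",
    r"^ANDS[WX]ri$",
    r"^(AND|BIC)S[WX]rr$",
]
CCMP_PATTERNS = [r"^CCMP(W|X)(i|r)", r"^CCMN(W|X)(i|r)"]


def matches_any(opcode, patterns):
    """Return True if any pattern matches the start of the opcode name."""
    # Assumption: prefix matching, approximated here with re.match.
    return any(re.match(p, opcode) for p in patterns)


# Example opcode names (illustrative, not an exhaustive list).
candidates = ["ADDSWri", "SUBSXrr", "ADDSWrs", "ADCSWr",
              "ANDSXri", "BICSXrr", "CCMPWi", "CCMNXr", "ADDWri"]

for name in candidates:
    covered = matches_any(name, FLAGSET_PATTERNS + CCMP_PATTERNS)
    print(f"{name:8s} {'matched' if covered else 'not matched'}")
```

In this sketch the shifted-register form `ADDSWrs` and the non-flag-setting `ADDWri` are left unmatched, because the first pattern only accepts the `ri` and `rr` suffixes and requires the `S` after the mnemonic.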
Excellent, in that case it LGTM.
Thanks for this, just have a couple of nits. Also, did you test this for ADD{S}/SUB{S} (immediate) with LSL #12? Based on the SOG it's unclear to me if this case should have a latency and throughput of 2 and use pipeline M. Otherwise, LGTM.