Commit db973ce

broxigarchen and arsenm authored
[AMDGPU][True16][CodeGen] True16 Add OpSel when optimizing exec mask (#128928)
True16 VOPCX instructions have the opsel argument. Add it when we create these instructions in SIOptimizeExecMasking.

Co-authored-by: Matt Arsenault <[email protected]>
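The change described above amounts to copying one more optional immediate operand onto the newly built compare. The following is a minimal, self-contained sketch of that pattern, not the pass's actual code: `ToyInst`, `findImm`, and `buildCmpx` are hypothetical names, and the behavior is inferred only from how `TryAddImmediateValueFromNamedOperand` is named and used in the hunk below (copy the operand if the source instruction has it, otherwise do nothing).

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a machine instruction; this is NOT an LLVM type.
struct ToyInst {
  std::string Opcode;
  // Named immediate operands in operand order: (name, value).
  std::vector<std::pair<std::string, long>> Imms;
};

// Return a pointer to the value of a named immediate, or nullptr if the
// instruction carries no operand with that name.
const long *findImm(const ToyInst &MI, const std::string &Name) {
  for (const auto &Op : MI.Imms)
    if (Op.first == Name)
      return &Op.second;
  return nullptr;
}

// Build the exec-writing compare from the original compare, copying each
// optional immediate only when the source instruction has it. Listing
// "op_sel" here mirrors the one-line addition in this commit.
ToyInst buildCmpx(const ToyInst &VCmp, const std::string &CmpxOpcode) {
  ToyInst Cmpx{CmpxOpcode, {}};
  const char *OptionalImms[] = {"clamp", "op_sel"};
  for (const char *Name : OptionalImms)
    if (const long *Imm = findImm(VCmp, Name))
      Cmpx.Imms.emplace_back(Name, *Imm);
  return Cmpx;
}
```

Under these assumptions, a compare that carries both `clamp` and `op_sel` yields a result with both operands in the same order, while a compare without `op_sel` is rebuilt unchanged. This is why the extra call is safe for opcodes that are not True16.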
1 parent 71389e5 commit db973ce

2 files changed: +66 -0 lines changed

llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp

Lines changed: 2 additions & 0 deletions
@@ -632,6 +632,8 @@ bool SIOptimizeExecMasking::optimizeVCMPSaveExecSequence(
 
   TryAddImmediateValueFromNamedOperand(AMDGPU::OpName::clamp);
 
+  TryAddImmediateValueFromNamedOperand(AMDGPU::OpName::op_sel);
+
   // The kill flags may no longer be correct.
   if (Src0->isReg())
     MRI->clearKillFlags(Src0->getReg());
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 2
+# RUN: llc -march=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize64 -run-pass=si-optimize-exec-masking -o - %s | FileCheck %s
+
+---
+name: int
+tracksRegLiveness: true
+body: |
+  ; CHECK-LABEL: name: int
+  ; CHECK: bb.0:
+  ; CHECK-NEXT: successors: %bb.1(0x80000000)
+  ; CHECK-NEXT: liveins: $vgpr20
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: $sgpr0_sgpr1 = S_MOV_B64 $exec
+  ; CHECK-NEXT: V_CMPX_LT_I16_t16_nosdst_e64 0, 15, 0, $vgpr20_lo16, 0, implicit-def $exec, implicit $exec
+  ; CHECK-NEXT: renamable $sgpr0_sgpr1 = S_XOR_B64 $exec, killed renamable $sgpr0_sgpr1, implicit-def dead $scc
+  ; CHECK-NEXT: S_CBRANCH_EXECZ %bb.1, implicit $exec
+  ; CHECK-NEXT: S_BRANCH %bb.1
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT: S_ENDPGM 0
+  bb.1:
+    liveins: $vgpr20
+    $vcc = V_CMP_LT_I16_t16_e64 0, 15, 0, $vgpr20_lo16, 0, implicit $exec
+    renamable $sgpr0_sgpr1 = COPY $exec, implicit-def $exec
+    renamable $sgpr2_sgpr3 = S_AND_B64 renamable $sgpr0_sgpr1, killed $vcc, implicit-def dead $scc
+    renamable $sgpr0_sgpr1 = S_XOR_B64 renamable $sgpr2_sgpr3, killed renamable $sgpr0_sgpr1, implicit-def dead $scc
+    $exec = S_MOV_B64_term killed renamable $sgpr2_sgpr3
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+    S_BRANCH %bb.2
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: float
+tracksRegLiveness: true
+body: |
+  ; CHECK-LABEL: name: float
+  ; CHECK: bb.0:
+  ; CHECK-NEXT: successors: %bb.1(0x80000000)
+  ; CHECK-NEXT: liveins: $vgpr20
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: $sgpr0_sgpr1 = S_MOV_B64 $exec
+  ; CHECK-NEXT: V_CMPX_LT_F16_t16_nosdst_e64 0, 15, 0, $vgpr20_lo16, 1, 0, implicit-def $exec, implicit $mode, implicit $exec
+  ; CHECK-NEXT: renamable $sgpr0_sgpr1 = S_XOR_B64 $exec, killed renamable $sgpr0_sgpr1, implicit-def dead $scc
+  ; CHECK-NEXT: S_CBRANCH_EXECZ %bb.1, implicit $exec
+  ; CHECK-NEXT: S_BRANCH %bb.1
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT: S_ENDPGM 0
+  bb.1:
+    liveins: $vgpr20
+    $vcc = V_CMP_LT_F16_t16_e64 0, 15, 0, $vgpr20_lo16, 1, 0, implicit $exec, implicit $mode
+    renamable $sgpr0_sgpr1 = COPY $exec, implicit-def $exec
+    renamable $sgpr2_sgpr3 = S_AND_B64 renamable $sgpr0_sgpr1, killed $vcc, implicit-def dead $scc
+    renamable $sgpr0_sgpr1 = S_XOR_B64 renamable $sgpr2_sgpr3, killed renamable $sgpr0_sgpr1, implicit-def dead $scc
+    $exec = S_MOV_B64_term killed renamable $sgpr2_sgpr3
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+    S_BRANCH %bb.2
+
+  bb.2:
+    S_ENDPGM 0
+...
