Commit 30913bb

petechou authored and igcbot committed
[Autobackout][FuncReg]Revert of change: 473f0a7
Avoid folding pseudo-and/pseudo-or into its 2 source defining instructions in some cases. Do not perform flag opt for the pseudo-and/pseudo-or when its mask option is mismatched with the mask options of its 2-source defining instructions, and the dst of pseudo-and/pseudo-or is global.
1 parent 3cfbd3c commit 30913bb

1 file changed: +9 -18 lines changed

visa/Optimizer.cpp

Lines changed: 9 additions & 18 deletions
@@ -3690,7 +3690,7 @@ bool Optimizer::foldPseudoNot(G4_BB *bb, INST_LIST_ITER &iter) {
 }
 
 /***
-this function optimizes the following cases:
+this function optmize the following cases:
 
 case 1:
 cmp.gt.P0 s0 s1
@@ -3723,7 +3723,7 @@ mov (1) P0 Imm (NoMask)
 smov (8) r[A0, 0] src0 src1 Imm
 
 case 5:
-pseudo_not (1) P2 P1
+psuedo_not (1) P2 P1
 and (1) P4 P3 P2
 ==>
 and (1) P4 P3 ~P1
@@ -3818,7 +3818,7 @@ void Optimizer::optimizeLogicOperation() {
 merged = foldPseudoAndOr(bb, ii);
 }
 
-// translate the pseudo op
+// translate the psuedo op
 if (!merged) {
 expandPseudoLogic(builder, bb, ii);
 }
@@ -3835,9 +3835,7 @@ bool Optimizer::foldPseudoAndOr(G4_BB *bb, INST_LIST_ITER &ii) {
 
 // optimization should apply even when the dst of the pseudo-and/pseudo-or is
 // global, since we are just hoisting it up, and WAR/WAW checks should be
-// performed as we search for the src0 and src1 inst. Also need to check if
-// the mask option of the pseudo-and/pseudo-or matches with the options of
-// the defining instructions when dst is global.
+// performed as we search for the src0 and src1 inst.
 
 G4_INST *inst = *ii;
 // look for def of srcs
@@ -3854,7 +3852,7 @@ bool Optimizer::foldPseudoAndOr(G4_BB *bb, INST_LIST_ITER &ii) {
 
 The new code uses defInstList directly, and aborts if there are more then are
 two definitions. Which means there is more then one instruction writing to
-source. Disadvantage of that is that it is less precise. For example if we
+source. Disadvantage of that is that it is less precisise. For example if we
 are folding in to closest definition then before it was OK, but now will be
 disallowed.
 */
@@ -3891,13 +3889,13 @@ bool Optimizer::foldPseudoAndOr(G4_BB *bb, INST_LIST_ITER &ii) {
 std::swap(defInstructions[0], defInstructions[1]);
 std::swap(maxSrc1, maxSrc2);
 }
-// Doing backward scan until earliest src to make sure dst of and/or is not
+// Doing backward scan until earlist src to make sure dst of and/or is not
 // being written to or being read
 /*
 handling case like in spmv_csr
-cmp.lt (M1, 1) P15 V40(0,0)<0;1,0> 0x10:w /// $191
-cmp.lt (M1, 1) P16 V110(0,0)<0;1,0> V34(0,0)<0;1,0> /// $192
-and (M1, 1) P16 P16 P15 /// $193
+cmp.lt (M1, 1) P15 V40(0,0)<0;1,0> 0x10:w /// $191 cmp.lt (M1, 1) P16
+V110(0,0)<0;1,0> V34(0,0)<0;1,0> /// $192 and (M1,
+1) P16 P16 P15 /// $193
 */
 if (chkBwdOutputHazard(defInstructions[1], ii, defInstructions[0])) {
 return false;
@@ -3952,13 +3950,6 @@ bool Optimizer::foldPseudoAndOr(G4_BB *bb, INST_LIST_ITER &ii) {
 return false;
 }
 
-// Check if mask options are mismatched between the pseudo-and/pseudo-or and
-// its defining instructions.
-if ((inst->getMaskOption() != src0DefInst->getMaskOption() ||
-inst->getMaskOption() != src1DefInst->getMaskOption()) &&
-fg.globalOpndHT.isOpndGlobal(inst->getDst()))
-return false;
-
 // do the case 3 optimization
 
 G4_PredState ps =
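
For readers who want the reverted guard in isolation, the following is a minimal sketch of the condition this commit removes from foldPseudoAndOr: skip the fold when the pseudo-and/pseudo-or has a mask option that differs from either defining instruction and its dst is global. The types and names here (MaskOptionSketch, InstSketch, maskMismatchBlocksFold) are hypothetical stand-ins for illustration only; they are not the vISA G4_INST API, and the real check reads the global flag from fg.globalOpndHT rather than from the instruction.

```cpp
// Hypothetical stand-in for an instruction's mask option (not the real vISA enum).
enum class MaskOptionSketch { NoMask, M0, M1 };

// Hypothetical, simplified view of an instruction: its mask option and
// whether its destination operand is global (live across basic blocks).
struct InstSketch {
  MaskOptionSketch mask;
  bool dstIsGlobal;
};

// Returns true when the removed guard would have rejected the fold:
// the logic op's mask differs from at least one defining instruction's mask
// AND the logic op's destination is global.
bool maskMismatchBlocksFold(const InstSketch &logicInst,
                            const InstSketch &src0Def,
                            const InstSketch &src1Def) {
  const bool mismatch =
      logicInst.mask != src0Def.mask || logicInst.mask != src1Def.mask;
  return mismatch && logicInst.dstIsGlobal;
}
```

With the revert applied, this condition is no longer evaluated, so a fold can proceed in the mismatched-mask, global-dst case as long as the remaining hazard checks pass.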
