
Commit f7d9f1d

lsatanovigcbot authored and committed
GenXGVClobberChecker comments improved, misc const-correctness-related change to genx.vload-related API.
1 parent 8ca9e82 commit f7d9f1d

File tree

3 files changed: +104 -66 lines changed

IGC/VectorCompiler/lib/GenXCodeGen/GenXGVClobberChecker.cpp

Lines changed: 44 additions & 40 deletions
@@ -1,6 +1,6 @@
 /*========================== begin_copyright_notice ============================
 
-Copyright (C) 2023 Intel Corporation
+Copyright (C) 2023-2024 Intel Corporation
 
 SPDX-License-Identifier: MIT
 
@@ -10,59 +10,63 @@ SPDX-License-Identifier: MIT
 // GenXClobberChecker
 //===----------------------------------------------------------------------===//
 //
-// Read access to GENX_VOLATILE variable yields vload + a user(most of the time
-// rdregion, but can be anything including vstore). During internal
-// optimizations the user can be (baled in (and or) collapsed (and or) moved
-// away) to a position in which it potentially gets affected by a store to the
-// same GENX_VOLATILE variable. This situation must be avoided (ideally - only
-// when the high-level program employed by-copy semantics, see below).
+// Read access to GENX_VOLATILE global (a global value having "genx_volatile"
+// attribute) is signified by genx.vload(@GENX_VOLATILE_GLOBAL*) and a
+// user (most of the time rdregion, but can be anything including genx.vstore or
+// even a phi (the case is known after simplifycfg pass merging genx.vloads
+// users)). In VC BE semantics during VISA code generation
+// genx.vload(@GENX_VOLATILE_GLOBAL*) IR value does not constitute any VISA
+// instruction by itself, instead it signifies a register address of an object
+// (simd/vector/matrix) pinned in the register file. The VISA instruction is
+// generated from the genx.vload user (or a broader bale sourcing it). The VISA
+// instruction therefore appears at the program text position of a genx.vload
+// user (or a broader bale sourcing it) and not the position of a genx.vload
+// intrinsic itself. During VC BE or standard LLVM optimizations a user
+// instruction (or a broader bale) can be transformed in a way that results in a
+// position "after" genx.vstore to the same GENX_VOLATILE variable, becoming
+// potentially clobbered by it. This situation must be avoided both in VC BE and
+// standard LLVM optimizations. Although we do control VC BE optimizations
+// codebase the issue is subtle and potentially reappearent. VC BE optimizations
+// use genx::isSafeTo<...>CheckAVLoadKill<...> API to avoid the abovementioned
+// situation during transformations performed. Cases when standard LLVM
+// optimizations break the intended VC BE semantics resulting in clobbering are
+// also known (e.g. mem2reg before allowed users subset for genx.vload was
+// defined (see genx::isAGVLoadForbiddenUser(...) routine and
+// GenXLegalizeGVLoadUses pass)).
 //
-// This pass implements a checker/fixup (available under
-// -check-gv-clobbering=true option, turned on by default in Debug build)
-// introduced late in pipeline. It is used to identify situations when we have
-// potentially clobbered the global volatile value.
+// This pass implements the checker (available under -check-gv-clobbering=true
+// option, turned on by default in Debug build) introduced late in pipeline. It
+// is used to identify situations when we have used the potentially clobbered
+// GENX_VOLATILE value.
 //
 // The checker warning about potential clobbering means that some optimization
-// pass has overlooked the aspect of vload/vstore semantics and must be fixed to
-// take it into account. Current list of affected passes:
-//
-// RegionCollapsing
-// FuncBaling
-// IMadLegalization
-// FuncGroupBaling
-// Depressurizer
-// ...
-//
-//----------------------------------------------------------------
-// TODO/IMPORTANT: presently there's no way to differentiate by-copy vs
-// by-reference semantics, so we try to avoid moving vload users "after" vstores
-// for all the cases, which results in less efficient code generation. The way
-// to differentiate by-copy vs by-reference access must be implemented and
-// optimizations restricted only for those use cases. By-reference accesses must
-// be allowed for optimization as before to provide with most efficient code
-// possible.
-//----------------------------------------------------------------
+// pass has overlooked the aspect of genx.vload/genx.vstore semantics described
+// above and must be fixed to take it into account by utilizing
+// genx::isSafeTo<...>CheckAVLoadKill<...>(...) API.
 //
 //-------------------------------
-// Pseudocode example
+// Simplified example, pseudocode:
 //-------------------------------
-// GENX_VOLATILE g = VALID_VALUE
-// funN() { g = INVALID_VALUE }
+// GENX_VOLATILE g = EXPECTED_VALUE
+// funN() { g = UNEXPECTED_VALUE }
 // fun1() { funN() }
 // kernel () {
 //    cpy = g  // Copy the value of g.
 //    fun1()   // Either store down function call changes g
-//    g = INVALID_VALUE // or store in the same function.
-//    use(cpy) // cpy == VALID_VALUE; use should see the copied value,
-//             // ... including complex control flow cases.
+//    g = UNEXPECTED_VALUE // or store in the same function.
+//    use(cpy) // cpy == EXPECTED_VALUE; use should see the copied value,
+//             // ... including any control flow cases.
 //  }
 // }
 //===----------------------------------------------------------------------===//
 //
-// This pass can be used as a standalone tool (under an opt utility) to check
-// the intermediate IR dumps acquired by the usage of -vc-dump-ir-split
-// -vc-dump-ir-before-pass='*' -vc-dump-ir-after-pass='*' options and/or
-// IGC_ShaderDumpEnable="1" and/or during an interactive debugging session.
+// To instantly identify the optimization pass at which problematic situation
+// occurs this pass can be used as a standalone tool (under an opt utility)
+// by checking intermediate IR dumps acquired with the usage of
+// -vc-dump-ir-split -vc-dump-ir-before-pass='*' -vc-dump-ir-after-pass='*'
+// compiler options and/or IGC_ShaderDumpEnable="1".
+//
+//===----------------------------------------------------------------------===//
 //
 // How to run the checker on individual IR dump (for individual options see
 // options descriptions below in this file:
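
To make the clobbering scenario from the rewritten header comment concrete, here is a minimal C++ sketch of the kind of check involved. It is illustrative only (not the pass's actual implementation), assumes the genx::getAGVLoadSrcOrNull and genx::isAGVStore helpers declared in GenXUtil.h further below, and only handles a load and its user within one basic block:

// Illustrative sketch only: flag a genx.vload user that may observe a
// clobbered value because a genx.vstore to the same GENX_VOLATILE global
// sits between the load and the user.
#include "GenXUtil.h"
#include "llvm/IR/Instruction.h"

using namespace llvm;

static bool userPotentiallyClobbered(const Instruction *VLoad,
                                     const Instruction *UserInst) {
  // nullptr if VLoad is not a genx.vload of a GENX_VOLATILE global.
  const Value *Gv = genx::getAGVLoadSrcOrNull(VLoad);
  if (!Gv || VLoad->getParent() != UserInst->getParent())
    return false; // Cross-block cases need a real reachability/dominance walk.
  for (const Instruction *I = VLoad->getNextNode(); I && I != UserInst;
       I = I->getNextNode())
    if (genx::isAGVStore(I, Gv)) // genx.vstore to the same volatile global.
      return true;
  return false;
}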

IGC/VectorCompiler/lib/GenXCodeGen/GenXUtil.cpp

Lines changed: 43 additions & 18 deletions
@@ -1,6 +1,6 @@
 /*========================== begin_copyright_notice ============================
 
-Copyright (C) 2017-2023 Intel Corporation
+Copyright (C) 2017-2024 Intel Corporation
 
 SPDX-License-Identifier: MIT
 
@@ -1776,15 +1776,21 @@ bool genx::isAVLoad(const Instruction *const I) {
   return (LI && LI->isVolatile()) || GenXIntrinsic::isVLoad(I);
 }
 
-Value *genx::getAVLoadSrcOrNull(const Instruction *const I,
-                                const Value *const CmpSrc) {
+const Value *genx::getAVLoadSrcOrNull(const Instruction *const I,
+                                      const Value *const CmpSrc) {
   if (genx::isAVLoad(I))
-    if (auto *Src = getBitCastedValue(I->getOperand(0));
+    if (const auto *Src = getBitCastedValue(I->getOperand(0));
         !CmpSrc || Src == CmpSrc)
       return Src;
   return nullptr;
 }
 
+Value *genx::getAVLoadSrcOrNull(Instruction *const I,
+                                const Value *const CmpSrc) {
+  return const_cast<Value *>(
+      getAVLoadSrcOrNull(const_cast<const Instruction *>(I), CmpSrc));
+}
+
 bool genx::isAVLoad(const Instruction *I, const Value *const CmpSrc) {
   return getAVLoadSrcOrNull(I, CmpSrc);
 }
@@ -1794,42 +1800,60 @@ bool genx::isAVStore(const Instruction *const I) {
          GenXIntrinsic::isVStore(I);
 }
 
-Value *genx::getAVStoreDstOrNull(const Instruction *const I,
-                                 const Value *const CmpDst) {
+const Value *genx::getAVStoreDstOrNull(const Instruction *const I,
+                                       const Value *const CmpDst) {
   if (genx::isAVStore(I)) {
-    if (auto *Dst = getBitCastedValue(I->getOperand(1));
+    if (const auto *Dst = getBitCastedValue(I->getOperand(1));
         !CmpDst || Dst == CmpDst)
       return Dst;
   }
   return nullptr;
 }
 
+Value *genx::getAVStoreDstOrNull(Instruction *const I,
+                                 const Value *const CmpDst) {
+  return const_cast<Value *>(
+      getAVStoreDstOrNull(const_cast<const Instruction *>(I), CmpDst));
+}
+
 bool genx::isAVStore(const Instruction *const I, const Value *const CmpDst) {
   return getAVStoreDstOrNull(I, CmpDst);
 }
 
-Value *genx::getAGVLoadSrcOrNull(const Instruction *const I,
-                                 const Value *const CmpGvSrc) {
-  auto *Src = getAVLoadSrcOrNull(I, CmpGvSrc);
+const Value *genx::getAGVLoadSrcOrNull(const Instruction *const I,
+                                       const Value *const CmpGvSrc) {
+  const auto *Src = getAVLoadSrcOrNull(I, CmpGvSrc);
   if (!Src)
     return nullptr;
-  auto *GV = vc::getUnderlyingGlobalVariable(Src);
+  const auto *GV = vc::getUnderlyingGlobalVariable(Src);
   return GV && GV->hasAttribute(genx::FunctionMD::GenXVolatile) ? GV : nullptr;
 }
 
-bool genx::isAGVLoad(const Instruction *I, const Value *const CmpGvSrc) {
+Value *genx::getAGVLoadSrcOrNull(Instruction *const I,
+                                 const Value *const CmpGvSrc) {
+  return const_cast<Value *>(
+      getAGVLoadSrcOrNull(const_cast<const Instruction *>(I), CmpGvSrc));
+}
+
+bool genx::isAGVLoad(const Instruction *const I, const Value *const CmpGvSrc) {
   return genx::getAGVLoadSrcOrNull(I, CmpGvSrc);
 }
 
-Value *genx::getAGVStoreDstOrNull(const Instruction *const I,
-                                  const Value *const CmpDst) {
-  auto *Dst = getAVStoreDstOrNull(I, CmpDst);
+const Value *genx::getAGVStoreDstOrNull(const Instruction *const I,
+                                        const Value *const CmpDst) {
+  const auto *Dst = getAVStoreDstOrNull(I, CmpDst);
   if (!Dst)
     return nullptr;
-  auto *GV = vc::getUnderlyingGlobalVariable(Dst);
+  const auto *GV = vc::getUnderlyingGlobalVariable(Dst);
   return GV && GV->hasAttribute(genx::FunctionMD::GenXVolatile) ? GV : nullptr;
 }
 
+Value *genx::getAGVStoreDstOrNull(Instruction *const I,
+                                  const Value *const CmpGvDst) {
+  return const_cast<Value *>(
+      getAGVStoreDstOrNull(const_cast<const Instruction *>(I), CmpGvDst));
+}
+
 bool genx::isAGVStore(const Instruction *const I, const Value *const CmpGvDst) {
   return genx::getAGVStoreDstOrNull(I, CmpGvDst);
 }
@@ -2544,8 +2568,9 @@ bool genx::isSafeToSink_CheckAVLoadKill(const Instruction *const I,
   IGC_ASSERT(I && To && Baling);
   Bale Bale;
   Baling->buildBale(
-      const_cast<Instruction *>(I), // TBD: buildBale API is not const-correct
-                                    // with regard to Instruction* argument.
+      const_cast<Instruction *>(
+          I), // The baling API is mostly used in non-const context, so it has
+              // no const-ok overloads.
       &Bale); // TBD!: recheck IncludeAddr=false here, we may want to take
               // address calculation into consideration when sinking, since
               // we'll most likely move the bale along with address calculation
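
All of the new non-const overloads added in this file follow the same well-known C++ delegation idiom: the const-qualified overload owns the logic, and the non-const overload forwards to it and casts constness back off the result, which is safe because the argument was non-const at that call site. A generic, self-contained sketch (plain C++, names invented, not IGC code):

#include <vector>

struct Node {
  int Key;
  std::vector<Node *> Children;
};

// The const overload does the real work.
const Node *findChild(const Node *Parent, int Key) {
  for (const Node *C : Parent->Children)
    if (C->Key == Key)
      return C;
  return nullptr;
}

// The non-const overload delegates, mirroring getAVLoadSrcOrNull and friends:
// const is added on the way in and removed from the result on the way out.
Node *findChild(Node *Parent, int Key) {
  return const_cast<Node *>(findChild(const_cast<const Node *>(Parent), Key));
}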

IGC/VectorCompiler/lib/GenXCodeGen/GenXUtil.h

Lines changed: 17 additions & 8 deletions
@@ -1,6 +1,6 @@
 /*========================== begin_copyright_notice ============================
 
-Copyright (C) 2020-2023 Intel Corporation
+Copyright (C) 2020-2024 Intel Corporation
 
 SPDX-License-Identifier: MIT
 
@@ -187,18 +187,27 @@ bool isGlobalStore(StoreInst *ST);
 bool isGlobalLoad(Instruction *I);
 bool isGlobalLoad(LoadInst* LI);
 
-Value *getAVLoadSrcOrNull(const Instruction *const I,
+const Value *getAVLoadSrcOrNull(const Instruction *const I,
+                                const Value *const CmpSrc = nullptr);
+Value *getAVLoadSrcOrNull(Instruction *const I,
                           const Value *const CmpSrc = nullptr);
-Value *getAGVLoadSrcOrNull(const Instruction *const I,
+const Value *getAGVLoadSrcOrNull(const Instruction *const I,
+                                 const Value *const CmpSrc = nullptr);
+Value *getAGVLoadSrcOrNull(Instruction *const I,
                            const Value *const CmpSrc = nullptr);
 bool isAVLoad(const Instruction *const I);
-bool isAVLoad(const Instruction *I, const Value *const CmpSrc);
-bool isAGVLoad(const Instruction *I, const Value *const CmpGvSrc = nullptr);
+bool isAVLoad(const Instruction *const I, const Value *const CmpSrc);
+bool isAGVLoad(const Instruction *const I,
+               const Value *const CmpGvSrc = nullptr);
 
-Value *getAVStoreDstOrNull(const Instruction *const I,
+const Value *getAVStoreDstOrNull(const Instruction *const I,
+                                 const Value *const CmpDst = nullptr);
+Value *getAVStoreDstOrNull(Instruction *const I,
                            const Value *const CmpDst = nullptr);
-Value *getAGVStoreDstOrNull(const Instruction *const I,
-                            const Value *const CmpDst = nullptr);
+const Value *getAGVStoreDstOrNull(const Instruction *const I,
+                                  const Value *const CmpGvDst = nullptr);
+Value *getAGVStoreDstOrNull(Instruction *const I,
+                            const Value *const CmpGvDst = nullptr);
 bool isAVStore(const Instruction *const I);
 bool isAVStore(const Instruction *const I, const Value *const CmpDst);
 bool isAGVStore(const Instruction *const I,
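
With both const and non-const declarations in place, callers pick up the matching overload through ordinary overload resolution. The hypothetical helpers below (names invented) illustrate the intent: analyses holding const IR pointers stay const-correct, while transforms can still obtain a mutable Value *:

#include "GenXUtil.h"
#include "llvm/IR/Instruction.h"

using namespace llvm;

// Read-only query: a const Instruction* binds to the const overload, so the
// analysis gets a const Value* back and cannot accidentally mutate the IR.
static bool loadsFromVolatileGlobal(const Instruction &I, const Value *Gv) {
  return genx::getAGVLoadSrcOrNull(&I, Gv) != nullptr;
}

// Transformation: a non-const Instruction* binds to the new non-const
// overload and yields a mutable Value* suitable for rewriting uses.
static Value *takeLoadSrcForRewrite(Instruction &I) {
  return genx::getAGVLoadSrcOrNull(&I);
}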
