/* ========================== begin_copyright_notice ============================

- Copyright (C) 2023 Intel Corporation
+ Copyright (C) 2023-2024 Intel Corporation

SPDX-License-Identifier: MIT

@@ -10,59 +10,63 @@ SPDX-License-Identifier: MIT
// GenXClobberChecker
// ===----------------------------------------------------------------------===//
//
- // Read access to GENX_VOLATILE variable yields vload + a user(most of the time
- // rdregion, but can be anything including vstore). During internal
- // optimizations the user can be (baled in (and or) collapsed (and or) moved
- // away) to a position in which it potentially gets affected by a store to the
- // same GENX_VOLATILE variable. This situation must be avoided (ideally - only
- // when the high-level program employed by-copy semantics, see below).
+ // A read access to a GENX_VOLATILE global (a global value with the
+ // "genx_volatile" attribute) is represented by
+ // genx.vload(@GENX_VOLATILE_GLOBAL*) and a user of it (most of the time a
+ // rdregion, but it can be anything, including a genx.vstore or even a phi;
+ // the phi case is known to appear after the simplifycfg pass merges genx.vload
+ // users). In VC BE semantics, during VISA code generation the
+ // genx.vload(@GENX_VOLATILE_GLOBAL*) IR value does not produce any VISA
+ // instruction by itself; instead it denotes the register address of an object
+ // (simd/vector/matrix) pinned in the register file. The VISA instruction is
+ // generated from the genx.vload user (or a broader bale sourcing it). The VISA
+ // instruction therefore appears at the program text position of the genx.vload
+ // user (or the broader bale sourcing it), not at the position of the genx.vload
+ // intrinsic itself. During VC BE or standard LLVM optimizations a user
+ // instruction (or a broader bale) can be transformed in a way that places it
+ // "after" a genx.vstore to the same GENX_VOLATILE global, so that it is
+ // potentially clobbered by that store. This situation must be avoided both in
+ // VC BE and in standard LLVM optimizations. Although we control the VC BE
+ // optimizations codebase, the issue is subtle and can easily reappear. VC BE
+ // optimizations use the genx::isSafeTo<...>CheckAVLoadKill<...> API to avoid
+ // this situation in the transformations they perform. Cases where standard
+ // LLVM optimizations break the intended VC BE semantics and cause clobbering
+ // are also known (e.g. mem2reg, before the allowed subset of genx.vload users
+ // was defined; see the genx::isAGVLoadForbiddenUser(...) routine and the
+ // GenXLegalizeGVLoadUses pass).
//
- // This pass implements a checker/fixup (available under
- // -check-gv-clobbering=true option, turned on by default in Debug build)
- // introduced late in pipeline. It is used to identify situations when we have
- // potentially clobbered the global volatile value.
+ // This pass implements the checker (available under the
+ // -check-gv-clobbering=true option, turned on by default in Debug builds),
+ // which runs late in the pipeline. It is used to identify situations in which
+ // a potentially clobbered GENX_VOLATILE value has been used.
//
// The checker warning about potential clobbering means that some optimization
- // pass has overlooked the aspect of vload/vstore semantics and must be fixed to
- // take it into account. Current list of affected passes:
- //
- // RegionCollapsing
- // FuncBaling
- // IMadLegalization
- // FuncGroupBaling
- // Depressurizer
- // ...
- //
- // ----------------------------------------------------------------
- // TODO/IMPORTANT: presently there's no way to differentiate by-copy vs
- // by-reference semantics, so we try to avoid moving vload users "after" vstores
- // for all the cases, which results in less efficient code generation. The way
- // to differentiate by-copy vs by-reference access must be implemented and
- // optimizations restricted only for those use cases. By-reference accesses must
- // be allowed for optimization as before to provide with most efficient code
- // possible.
- // ----------------------------------------------------------------
+ // pass has overlooked the aspect of genx.vload/genx.vstore semantics described
+ // above and must be fixed to take it into account by using the
+ // genx::isSafeTo<...>CheckAVLoadKill<...>(...) API.
//
// -------------------------------
- // Pseudocode example
+ // Simplified example, pseudocode:
// -------------------------------
- // GENX_VOLATILE g = VALID_VALUE
- // funN() { g = INVALID_VALUE }
+ // GENX_VOLATILE g = EXPECTED_VALUE
+ // funN() { g = UNEXPECTED_VALUE }
// fun1() { funN() }
// kernel () {
// cpy = g // Copy the value of g.
// fun1() // Either store down function call changes g
- // g = INVALID_VALUE // or store in the same function.
- // use(cpy) // cpy == VALID_VALUE ; use should see the copied value,
- // // ... including complex control flow cases.
+ // g = UNEXPECTED_VALUE // or store in the same function.
+ // use(cpy) // cpy == EXPECTED_VALUE ; use should see the copied value,
+ // // ... including any control flow cases.
// }
// }
// ===----------------------------------------------------------------------===//
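The condition that the pseudocode example illustrates can be modelled on a toy straight-line program. The sketch below is an illustration only and makes several assumptions: `Op`, `Inst`, `LoadInfo`, `findClobberedUses` and `demo` are hypothetical names and not part of VC BE or LLVM; a program is a flat list of instructions; and function calls, bales and control flow (which the real checker has to handle) are not modelled. A use is flagged when a store to the same global sits between the load that defined the value and the use itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy model only: none of these names exist in VC BE or LLVM.
enum class Op { VLoad, VStore, Use };

struct Inst {
  Op Kind;
  std::string Global; // VLoad/VStore: name of the GENX_VOLATILE global.
  int Value;          // VLoad: id of the value defined; Use: id of the value read.
};

struct LoadInfo {
  std::string Global; // Global the value was loaded from.
  bool Killed;        // True once a later store to that global has been seen.
};

// Returns the indices of Use instructions that read a loaded value after a
// store to the same global has appeared between the load and the use.
std::vector<std::size_t> findClobberedUses(const std::vector<Inst> &Prog) {
  std::unordered_map<int, LoadInfo> Loads;
  std::vector<std::size_t> Clobbered;
  for (std::size_t I = 0; I < Prog.size(); ++I) {
    const Inst &In = Prog[I];
    switch (In.Kind) {
    case Op::VLoad:
      Loads[In.Value] = LoadInfo{In.Global, false};
      break;
    case Op::VStore:
      // Every earlier load of this global is now potentially stale.
      for (auto &Entry : Loads)
        if (Entry.second.Global == In.Global)
          Entry.second.Killed = true;
      break;
    case Op::Use: {
      auto It = Loads.find(In.Value);
      if (It != Loads.end() && It->second.Killed)
        Clobbered.push_back(I);
      break;
    }
    }
  }
  return Clobbered;
}

// Mirrors the kernel from the pseudocode: cpy = g; g = ...; use(cpy).
void demo() {
  std::vector<Inst> Kernel = {
      {Op::VLoad, "g", 0}, {Op::VStore, "g", 0}, {Op::Use, "", 0}};
  std::vector<std::size_t> Bad = findClobberedUses(Kernel);
  assert(Bad.size() == 1 && Bad[0] == 2);

  // The same use placed before the store is not flagged.
  std::vector<Inst> Safe = {
      {Op::VLoad, "g", 0}, {Op::Use, "", 0}, {Op::VStore, "g", 0}};
  assert(findClobberedUses(Safe).empty());
}
```

In this model the first program corresponds to a user that has ended up "after" the store and the second to the intended ordering; a store to a different global leaves the loaded value untouched.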
//
- // This pass can be used as a standalone tool (under an opt utility) to check
- // the intermediate IR dumps acquired by the usage of -vc-dump-ir-split
- // -vc-dump-ir-before-pass='*' -vc-dump-ir-after-pass='*' options and/or
- // IGC_ShaderDumpEnable="1" and/or during an interactive debugging session.
+ // To quickly identify the optimization pass at which the problematic situation
+ // occurs, this pass can be used as a standalone tool (under the opt utility)
+ // to check intermediate IR dumps acquired with the
+ // -vc-dump-ir-split -vc-dump-ir-before-pass='*' -vc-dump-ir-after-pass='*'
+ // compiler options and/or IGC_ShaderDumpEnable="1".
+ //
+ // ===----------------------------------------------------------------------===//
//
// How to run the checker on individual IR dump (for individual options see
// options descriptions below in this file: