Skip to content

Commit 0c63122

Browse files
cxy-1993joker-eph
authored andcommitted
[MLIR] Add stage to side effect
[MLIR] Add stage and effectOnFullRegion to side effect This patch add stage and effectOnFullRegion to side effect for optimization pass to obtain more accurate information. Stage uses numbering to track the side effects's stage of occurrence. EffectOnFullRegion indicates if effect act on every single value of resource. RFC disscussion: https://discourse.llvm.org/t/rfc-add-effect-index-in-memroy-effect/72235 Differential Revision: https://reviews.llvm.org/D156087 Reviewed By: mehdi_amini, Mogball Differential Revision: https://reviews.llvm.org/D156087
1 parent bebb9df commit 0c63122

File tree

14 files changed

+260
-60
lines changed

14 files changed

+260
-60
lines changed

mlir/docs/Rationale/SideEffectsAndSpeculation.md

Lines changed: 88 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -47,7 +47,10 @@ Operations with implicit behaviors can be broadly categorized as follows:
4747
`longjmp`, operations that throw exceptions.
4848

4949
Finally, a given operation may have a combination of the above implicit
50-
behaviors.
50+
behaviors. The combination of implicit behaviors during the execution of the
51+
operation may be ordered. We use 'stage' to label the order of implicit
52+
behaviors during the execution of 'op'. Implicit behaviors with a lower stage
53+
number happen earlier than those with a higher stage number.
5154

5255
## Modeling
5356

@@ -76,6 +79,10 @@ When adding a new op, ask:
7679

7780
1. Does it read from or write to the heap or stack? It should probably implement
7881
`MemoryEffectsOpInterface`.
82+
1. Does these side effects ordered? It should probably set the stage of
83+
side effects to make analysis more accurate.
84+
1. Does These side effects act on every single value of resource? It probably
85+
should set the FullEffect on effect.
7986
1. Does it have side effects that must be preserved, like a volatile store or a
8087
syscall? It should probably implement `MemoryEffectsOpInterface` and model
8188
the effect as a read from or write to an abstract `Resource`. Please start an
@@ -91,3 +98,83 @@ When adding a new op, ask:
9198
1. Is your operation free of side effects and can be freely hoisted, introduced
9299
and eliminated? It should probably be marked `Pure`. (TODO: revisit this name
93100
since it has overloaded meanings in C++.)
101+
102+
## Examples
103+
104+
This section describes a few very simple examples that help understand how to
105+
add side effect correctly.
106+
107+
### SIMD compute operation
108+
109+
If we have a SIMD backend dialect with a "simd.abs" operation, which reads all
110+
values from the source memref, calculates their absolute values, and writes them
111+
to the target memref.
112+
113+
```mlir
114+
func.func @abs(%source : memref<10xf32>, %target : memref<10xf32>) {
115+
simd.abs(%source, %target) : memref<10xf32> to memref<10xf32>
116+
return
117+
}
118+
```
119+
120+
The abs operation reads each individual value from the source resource and then
121+
writes these values to each corresponding value in the target resource.
122+
Therefore, we need to specify a read side effect for the source and a write side
123+
effect for the target. The read side effect occurs before the write side effect,
124+
so we need to mark the read stage as earlier than the write stage. Additionally,
125+
we need to indicate that these side effects apply to each individual value in
126+
the resource.
127+
128+
A typical approach is as follows:
129+
``` mlir
130+
def AbsOp : SIMD_Op<"abs", [...] {
131+
...
132+
133+
let arguments = (ins Arg<AnyRankedOrUnrankedMemRef, "the source memref",
134+
[MemReadAt<0, FullEffect>]>:$source,
135+
Arg<AnyRankedOrUnrankedMemRef, "the target memref",
136+
[MemWriteAt<1, FullEffect>]>:$target);
137+
138+
...
139+
}
140+
```
141+
142+
In the above example, we attach the side effect [MemReadAt<0, FullEffect>] to
143+
the source, indicating that the abs operation reads each individual value from
144+
the source during stage 0. Likewise, we attach the side effect
145+
[MemWriteAt<1, FullEffect>] to the target, indicating that the abs operation
146+
writes to each individual value within the target during stage 1 (after reading
147+
from the source).
148+
149+
### Load like operation
150+
151+
Memref.load is a typical load like operation:
152+
```mlir
153+
func.func @foo(%input : memref<10xf32>, %index : index) -> f32 {
154+
%result = memref.load %input[index] : memref<10xf32>
155+
return %result : f32
156+
}
157+
```
158+
159+
The load like operation reads a single value from the input memref and returns
160+
it. Therefore, we need to specify a partial read side effect for the input
161+
memref, indicating that not every single value is used.
162+
163+
A typical approach is as follows:
164+
``` mlir
165+
def LoadOp : MemRef_Op<"load", [...] {
166+
...
167+
168+
let arguments = (ins Arg<AnyMemRef, "the reference to load from",
169+
[MemReadAt<0, PartialEffect>]>:$memref,
170+
Variadic<Index>:$indices,
171+
DefaultValuedOptionalAttr<BoolAttr, "false">:$nontemporal);
172+
173+
...
174+
}
175+
```
176+
177+
In the above example, we attach the side effect [MemReadAt<0, PartialEffect>] to
178+
the source, indicating that the load operation reads parts of values from the
179+
memref during stage 0. Since side effects typically occur at stage 0 and are
180+
partial by default, we can abbreviate it as "[MemRead]".

mlir/include/mlir/Dialect/Bufferization/IR/BufferizationOps.td

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -383,7 +383,8 @@ def Bufferization_ToTensorOp : Bufferization_Op<"to_tensor", [
383383
}];
384384

385385
let arguments = (ins Arg<AnyRankedOrUnrankedMemRef,
386-
"the reference to load from", [MemRead]>:$memref,
386+
"the reference to load from",
387+
[MemReadAt<0, FullEffect>]>:$memref,
387388
UnitAttr:$restrict, UnitAttr:$writable);
388389
let results = (outs AnyTensor:$result);
389390

mlir/include/mlir/Dialect/GPU/IR/GPUOps.td

Lines changed: 8 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1228,7 +1228,7 @@ def GPU_AllocOp : GPU_Op<"alloc", [
12281228
let arguments = (ins Variadic<GPU_AsyncToken>:$asyncDependencies,
12291229
Variadic<Index>:$dynamicSizes, Variadic<Index>:$symbolOperands,
12301230
UnitAttr:$hostShared);
1231-
let results = (outs Res<AnyMemRef, "", [MemAlloc]>:$memref,
1231+
let results = (outs Res<AnyMemRef, "", [MemAllocAt<0, FullEffect>]>:$memref,
12321232
Optional<GPU_AsyncToken>:$asyncToken);
12331233

12341234
let extraClassDeclaration = [{
@@ -1268,7 +1268,7 @@ def GPU_DeallocOp : GPU_Op<"dealloc", [GPU_AsyncOpInterface]> {
12681268
}];
12691269

12701270
let arguments = (ins Variadic<GPU_AsyncToken>:$asyncDependencies,
1271-
Arg<AnyMemRef, "", [MemFree]>:$memref);
1271+
Arg<AnyMemRef, "", [MemFreeAt<0, FullEffect>]>:$memref);
12721272
let results = (outs Optional<GPU_AsyncToken>:$asyncToken);
12731273

12741274
let assemblyFormat = [{
@@ -1299,8 +1299,8 @@ def GPU_MemcpyOp : GPU_Op<"memcpy", [GPU_AsyncOpInterface]> {
12991299
}];
13001300

13011301
let arguments = (ins Variadic<GPU_AsyncToken>:$asyncDependencies,
1302-
Arg<AnyMemRef, "", [MemWrite]>:$dst,
1303-
Arg<AnyMemRef, "", [MemRead]>:$src);
1302+
Arg<AnyMemRef, "", [MemWriteAt<0, FullEffect>]>:$dst,
1303+
Arg<AnyMemRef, "", [MemReadAt<0, FullEffect>]>:$src);
13041304
let results = (outs Optional<GPU_AsyncToken>:$asyncToken);
13051305

13061306
let assemblyFormat = [{
@@ -1335,7 +1335,7 @@ def GPU_MemsetOp : GPU_Op<"memset",
13351335
}];
13361336

13371337
let arguments = (ins Variadic<GPU_AsyncToken>:$asyncDependencies,
1338-
Arg<AnyMemRef, "", [MemWrite]>:$dst,
1338+
Arg<AnyMemRef, "", [MemWriteAt<0, FullEffect>]>:$dst,
13391339
Arg<AnyType, "">:$value);
13401340
let results = (outs Optional<GPU_AsyncToken>:$asyncToken);
13411341

@@ -1390,7 +1390,8 @@ def GPU_SubgroupMmaLoadMatrixOp : GPU_Op<"subgroup_mma_load_matrix",
13901390
```
13911391
}];
13921392

1393-
let arguments = (ins Arg<GPU_MMAMemRef, "", [MemRead]>:$srcMemref,
1393+
let arguments = (ins Arg<GPU_MMAMemRef, "",
1394+
[MemReadAt<0, FullEffect>]>:$srcMemref,
13941395
Variadic<Index>:$indices,
13951396
IndexAttr:$leadDimension,
13961397
OptionalAttr<UnitAttr>:$transpose);
@@ -1431,7 +1432,7 @@ def GPU_SubgroupMmaStoreMatrixOp : GPU_Op<"subgroup_mma_store_matrix",
14311432
}];
14321433

14331434
let arguments = (ins Arg<MMAMatrixOf<[SI8, UI8, I32, F16, F32]>>:$src,
1434-
Arg<GPU_MMAMemRef, "",[MemWrite]>:$dstMemref,
1435+
Arg<GPU_MMAMemRef, "",[MemWriteAt<0, FullEffect>]>:$dstMemref,
14351436
Variadic<Index>:$indices,
14361437
IndexAttr:$leadDimension,
14371438
OptionalAttr<UnitAttr>:$transpose);

mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td

Lines changed: 13 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -72,7 +72,8 @@ class AllocLikeOp<string mnemonic,
7272
Variadic<Index>:$symbolOperands,
7373
ConfinedAttr<OptionalAttr<I64Attr>,
7474
[IntMinValue<0>]>:$alignment);
75-
let results = (outs Res<AnyMemRef, "", [MemAlloc<resource>]>:$memref);
75+
let results = (outs Res<AnyMemRef, "",
76+
[MemAlloc<resource, 0, FullEffect>]>:$memref);
7677

7778
let builders = [
7879
OpBuilder<(ins "MemRefType":$memrefType,
@@ -276,12 +277,15 @@ def MemRef_ReallocOp : MemRef_Op<"realloc"> {
276277
// memref and allocating the outcoming memref, even though this may not
277278
// physically happen on each execution.
278279

279-
let arguments = (ins Arg<MemRefRankOf<[AnyType], [1]>, "", [MemFree]>:$source,
280+
let arguments = (ins Arg<MemRefRankOf<[AnyType], [1]>, "",
281+
[MemFreeAt<0, FullEffect>]>:$source,
280282
Optional<Index>:$dynamicResultSize,
281283
ConfinedAttr<OptionalAttr<I64Attr>,
282284
[IntMinValue<0>]>:$alignment);
283285

284-
let results = (outs Res<MemRefRankOf<[AnyType], [1]>, "", [MemAlloc<DefaultResource>]>);
286+
let results = (outs Res<MemRefRankOf<[AnyType], [1]>, "",
287+
[MemAlloc<DefaultResource, 1,
288+
FullEffect>]>);
285289

286290
let builders = [
287291
OpBuilder<(ins "MemRefType":$resultType,
@@ -532,9 +536,9 @@ def CopyOp : MemRef_Op<"copy", [CopyOpInterface, SameOperandsElementType,
532536
}];
533537

534538
let arguments = (ins Arg<AnyRankedOrUnrankedMemRef, "the memref to copy from",
535-
[MemRead]>:$source,
539+
[MemReadAt<0, FullEffect>]>:$source,
536540
Arg<AnyRankedOrUnrankedMemRef, "the memref to copy to",
537-
[MemWrite]>:$target);
541+
[MemWriteAt<0, FullEffect>]>:$target);
538542

539543
let assemblyFormat = [{
540544
$source `,` $target attr-dict `:` type($source) `to` type($target)
@@ -564,7 +568,8 @@ def MemRef_DeallocOp : MemRef_Op<"dealloc", [MemRefsNormalizable]> {
564568
```
565569
}];
566570

567-
let arguments = (ins Arg<AnyRankedOrUnrankedMemRef, "", [MemFree]>:$memref);
571+
let arguments = (ins Arg<AnyRankedOrUnrankedMemRef, "",
572+
[MemFreeAt<0, FullEffect>]>:$memref);
568573

569574
let hasFolder = 1;
570575
let assemblyFormat = "$memref attr-dict `:` type($memref)";
@@ -2107,7 +2112,8 @@ def TensorStoreOp : MemRef_Op<"tensor_store",
21072112
}];
21082113

21092114
let arguments = (ins AnyTensor:$tensor, Arg<AnyRankedOrUnrankedMemRef,
2110-
"the reference to store to", [MemWrite]>:$memref);
2115+
"the reference to store to",
2116+
[MemWriteAt<0, FullEffect>]>:$memref);
21112117

21122118
let assemblyFormat = "$tensor `,` $memref attr-dict `:` type($memref)";
21132119
}

mlir/include/mlir/Dialect/NVGPU/IR/NVGPU.td

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -239,7 +239,7 @@ def NVGPU_LdMatrixOp : NVGPU_Op<"ldmatrix", [
239239
```
240240
}];
241241

242-
let arguments = (ins Arg<AnyMemRef, "", [MemRead]>:$srcMemref,
242+
let arguments = (ins Arg<AnyMemRef, "", [MemReadAt<0, FullEffect>]>:$srcMemref,
243243
Variadic<Index>:$indices, BoolAttr:$transpose,
244244
I32Attr:$numTiles);
245245
let results = (outs AnyVector:$res);
@@ -423,9 +423,9 @@ def NVGPU_DeviceAsyncCopyOp : NVGPU_Op<"device_async_copy", [
423423
```
424424
}];
425425
let results = (outs NVGPU_DeviceAsyncToken:$asyncToken);
426-
let arguments = (ins Arg<AnyMemRef, "", [MemWrite]>:$dst,
426+
let arguments = (ins Arg<AnyMemRef, "", [MemWriteAt<0, FullEffect>]>:$dst,
427427
Variadic<Index>:$dstIndices,
428-
Arg<AnyMemRef, "", [MemRead]>:$src,
428+
Arg<AnyMemRef, "", [MemReadAt<0, FullEffect>]>:$src,
429429
Variadic<Index>:$srcIndices,
430430
IndexAttr:$dstElements,
431431
Optional<Index>:$srcElements,
@@ -630,7 +630,7 @@ def NVGPU_TmaAsyncLoadOp : NVGPU_Op<"tma.async.load", []> {
630630

631631
The Op uses `$barrier` mbarrier based completion mechanism.
632632
}];
633-
let arguments = (ins Arg<AnyMemRef, "", [MemWrite]>:$dst,
633+
let arguments = (ins Arg<AnyMemRef, "", [MemWriteAt<0, FullEffect>]>:$dst,
634634
NVGPU_MBarrierGroup:$barriers,
635635
NVGPU_TensorMapDescriptor:$tensorMapDescriptor,
636636
Variadic<Index>:$coordinates,

mlir/include/mlir/Interfaces/SideEffectInterfaceBase.td

Lines changed: 17 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -149,10 +149,19 @@ class EffectOpInterfaceBase<string name, string baseEffect>
149149
string baseEffectName = baseEffect;
150150
}
151151

152+
153+
class EffectRange <bits<1> val> {
154+
bits<1> Value = val;
155+
}
156+
157+
def FullEffect : EffectRange<1>;
158+
def PartialEffect : EffectRange<0>;
159+
152160
// This class is the general base side effect class. This is used by derived
153161
// effect interfaces to define their effects.
154162
class SideEffect<EffectOpInterfaceBase interface, string effectName,
155-
Resource resourceReference> : OpVariableDecorator {
163+
Resource resourceReference, int effectStage, EffectRange range>
164+
: OpVariableDecorator {
156165
/// The name of the base effects class.
157166
string baseEffectName = interface.baseEffectName;
158167

@@ -167,6 +176,13 @@ class SideEffect<EffectOpInterfaceBase interface, string effectName,
167176

168177
/// The resource that the effect is being applied to.
169178
string resource = resourceReference.name;
179+
180+
/// The stage of side effects, we use it to describe the sequence in which
181+
/// effects occur.
182+
int stage = effectStage;
183+
184+
// Does this side effect act on every single value of resource.
185+
bit effectOnFullRegion = range.Value;
170186
}
171187

172188
// This class is the base used for specifying effects applied to an operation.

0 commit comments

Comments
 (0)