Commit 3af7b78

Add integration section
1 parent 0281de3 commit 3af7b78

1 file changed: +16 -0 lines changed

doc/GPUPipeline.md

Lines changed: 16 additions & 0 deletions
@@ -31,6 +31,22 @@ Going up the pipeline, the abstractions needed to express specific ISA semantics

TODO: gpu(x), linalg-to-scf, gpu-map-parallel-loops.

### Integration
There are three major points of integration that affect how the pipeline is built:
1. Input representation.
2. Memory management.
3. Runtime interfaces.

The primary input for our pipelines is linalg on tensors with named ops. Named ops are fairly flexible (adding new ones upstream is more or less straightforward) and cover a lot of ground.

Memory management has to deal with weight caching, dynamic shapes, input/output handling, etc. Certain decisions on the compiler user side lead to additional complications in the pipeline.
For example, having to deal with 'logical' tensors for OneDNN imposes constraints on constant folding.
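The constant-folding constraint can be illustrated with a minimal Python sketch. It rests on an assumption the text does not spell out: that a 'logical' tensor describes only metadata (shape, type) at compile time, so constant weight values first become available at execution. Under that assumption, folding cannot happen inside the compiler and is instead deferred to the first run and cached. `DeferredConstantFold` and `run_kernel` are hypothetical names, not part of OneDNN or of the pipeline.

```python
from typing import Callable, List, Optional


class DeferredConstantFold:
    """Runs a fold function on first use and caches the result (hypothetical helper)."""

    def __init__(self, fold_fn: Callable[[List[float]], List[float]]) -> None:
        self._fold_fn = fold_fn
        self._cached: Optional[List[float]] = None
        self.fold_calls = 0  # how many times the fold actually executed

    def get(self, weights: List[float]) -> List[float]:
        # Weight values are assumed to be unknown at compile time,
        # so the fold runs here, at the first execution, and only once.
        if self._cached is None:
            self._cached = self._fold_fn(weights)
            self.fold_calls += 1
        return self._cached


def run_kernel(folded_weights: List[float], activations: List[float]) -> float:
    # Stand-in for the compiled kernel: a plain dot product.
    return sum(w * a for w, a in zip(folded_weights, activations))


# Hypothetical constant subgraph: pre-scale the weights by 2.
fold = DeferredConstantFold(lambda ws: [2.0 * w for w in ws])
weights = [1.0, 2.0, 3.0]

first = run_kernel(fold.get(weights), [1.0, 1.0, 1.0])   # fold runs here
second = run_kernel(fold.get(weights), [0.0, 1.0, 0.0])  # cached result reused
```

The second call does not fold again, which is where weight caching and constant folding meet in this sketch.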

The choice of runtime interface defines how much additional logic should reside in the pipeline. For managed devices (such as a GPU) there are two distinct options:
1. The compiler only emits a binary for the target device.
2. The compiler emits a binary and a launch stub that interacts with an appropriate runtime.

The latter provides more context and thus potentially more opportunities for optimization. The former gives the user more control and simplifies the pipeline.
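The trade-off between the two options can be sketched as a small Python model of who owns the host-side work. This is only an illustration: the stage names, `HOST_SIDE_WORK`, and the `plan_pipeline` helper are hypothetical and do not correspond to actual passes in the pipeline.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class RuntimeInterface(Enum):
    BINARY_ONLY = 1             # option 1: emit only the device binary
    BINARY_AND_LAUNCH_STUB = 2  # option 2: emit the binary plus a launch stub


# Hypothetical host-side duties that someone has to perform for a managed device.
HOST_SIDE_WORK = ["allocate-device-memory", "transfer-data", "launch-kernel"]


@dataclass
class PipelinePlan:
    compiler_stages: List[str] = field(default_factory=list)
    user_duties: List[str] = field(default_factory=list)


def plan_pipeline(interface: RuntimeInterface) -> PipelinePlan:
    # Both options lower to device code and serialize a binary.
    plan = PipelinePlan(compiler_stages=["lower-to-device", "serialize-binary"])
    if interface is RuntimeInterface.BINARY_ONLY:
        # The user drives the runtime; the pipeline stays short.
        plan.user_duties = list(HOST_SIDE_WORK)
    else:
        # The compiler sees the host-side work and emits stub logic for it.
        plan.compiler_stages += [f"emit-stub:{work}" for work in HOST_SIDE_WORK]
    return plan


binary_only = plan_pipeline(RuntimeInterface.BINARY_ONLY)
with_stub = plan_pipeline(RuntimeInterface.BINARY_AND_LAUNCH_STUB)
```

In this model the binary-only plan has two compiler stages and leaves all host-side duties to the user, while the stub plan grows the pipeline but makes the host-side work visible to the compiler.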

### The path of least resistance
The first milestone for the pipeline aims to take what works today and put it together.