
Commit 2fa153e

[SYCL][DOC] Update SPV_INTEL_joint_matrix
The PR adds checked load/store and construct instructions.

Signed-off-by: Sidorov, Dmitry <[email protected]>
1 parent 774f2b9 commit 2fa153e

1 file changed

+210
-31
lines changed

sycl/doc/design/spirv-extensions/SPV_INTEL_joint_matrix.asciidoc

Lines changed: 210 additions & 31 deletions
@@ -10,9 +10,15 @@
1010
:bf16_capability_token: 6437
1111
:capability_prefetch_name: CooperativeMatrixPrefetchINTEL
1212
:capability_prefetch_token: 6411
13+
:capability_checked_name: CooperativeMatrixCheckedInstructionsINTEL
14+
:capability_checked_token: 6192
1315
:OpCooperativeMatrixGetElementCoordINTEL_token: 6440
1416
:OpCooperativeMatrixApplyFunctionINTEL_token: 6448
1517
:OpCooperativeMatrixPrefetchINTEL_token: 6449
18+
:OpCooperativeMatrixLoadCheckedINTEL_token: 6193
19+
:OpCooperativeMatrixStoreCheckedINTEL_token: 6194
20+
:OpCooperativeMatrixConstructCheckedINTEL_token: 6195
21+
1622

1723
:DPCPP_URL: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_intel_matrix.asciidoc
1824
:bfloat16_conv_url: http://htmlpreview.github.io/?https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/INTEL/SPV_INTEL_bfloat16_conversion.html
@@ -67,7 +73,7 @@ please let us know!
6773
[width="40%",cols="25,25"]
6874
|========================================
6975
| Last Modified Date | 2023-11-06
70-
| Revision | 15
76+
| Revision | 16
7177
|========================================
7278

7379
== Dependencies
@@ -116,6 +122,7 @@ This extension introduces new capabilities:
116122
{invocation_capability_name}
117123
{tf32_capability_name}
118124
{bf16_capability_name}
125+
{capability_checked_name}
119126
{capability_prefetch_name}
120127
----
121128

@@ -137,6 +144,15 @@ OpCooperativeMatrixPrefetchINTEL
137144
138145
----
139146

147+
Instructions added under the *{capability_checked_name}* capability:
148+
149+
----
150+
151+
OpCooperativeMatrixLoadCheckedINTEL
152+
OpCooperativeMatrixStoreCheckedINTEL
153+
OpCooperativeMatrixConstructCheckedINTEL
154+
155+
----
140156

141157
== Token Number Assignments
142158

@@ -149,9 +165,13 @@ OpCooperativeMatrixPrefetchINTEL
149165
|*{tf32_capability_name}* | {tf32_capability_token}
150166
|*{bf16_capability_name}* | {bf16_capability_token}
151167
|*{capability_prefetch_name}* | {capability_prefetch_token}
168+
|*{capability_checked_name}* | {capability_checked_token}
152169
|*OpCooperativeMatrixGetElementCoordINTEL* | {OpCooperativeMatrixGetElementCoordINTEL_token}
153170
|*OpCooperativeMatrixApplyFunctionINTEL* | {OpCooperativeMatrixApplyFunctionINTEL_token}
154171
|*OpCooperativeMatrixPrefetchINTEL* | {OpCooperativeMatrixPrefetchINTEL_token}
172+
|*OpCooperativeMatrixLoadCheckedINTEL* | {OpCooperativeMatrixLoadCheckedINTEL_token}
173+
|*OpCooperativeMatrixStoreCheckedINTEL* | {OpCooperativeMatrixStoreCheckedINTEL_token}
174+
|*OpCooperativeMatrixConstructCheckedINTEL* | {OpCooperativeMatrixConstructCheckedINTEL_token}
155175
|====
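The token assignments added by this revision can be written down as constants. The following is a minimal sketch, not part of the extension: the names `CheckedOp`, `first_word` and the `k...` constant are hypothetical, while the numeric values are the ones defined in this document's attribute list. The helper relies on the general SPIR-V encoding rule that the first word of an instruction carries the word count in its high 16 bits and the opcode in its low 16 bits.

```cpp
#include <cassert>
#include <cstdint>

// Capability token from this document (name of the constant is hypothetical).
constexpr std::uint32_t kCapabilityCooperativeMatrixCheckedInstructionsINTEL = 6192;

// Opcode tokens from this document (enum name is hypothetical).
enum class CheckedOp : std::uint32_t {
  LoadChecked = 6193,
  StoreChecked = 6194,
  ConstructChecked = 6195,
};

// Hypothetical helper: builds the first word of a SPIR-V instruction.
// High 16 bits hold the word count, low 16 bits hold the opcode.
inline std::uint32_t first_word(std::uint32_t word_count, CheckedOp op) {
  assert(word_count <= 0xFFFFu && "word count must fit in 16 bits");
  return (word_count << 16) | static_cast<std::uint32_t>(op);
}
```

For example, `first_word(9, CheckedOp::LoadChecked)` would describe a checked load with neither of its optional operands present, since this document gives that instruction a word count of 9 + variable.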
156176

157177
== Modifications to the SPIR-V Specification, Version 1.6 and SPV_KHR_cooperative_matrix, Revision 3
@@ -231,6 +251,13 @@ Uses *BFloat16* in 3.X, Cooperative Matrix Operands +
231251
Uses *OpCooperativeMatrixPrefetchINTEL* instructions. +
232252
+
233253
| *{main_capability_name}* +
254+
| {capability_checked_token} | *{capability_checked_name}* +
255+
+
256+
Uses *OpCooperativeMatrixLoadCheckedINTEL*, *OpCooperativeMatrixStoreCheckedINTEL* and
257+
*OpCooperativeMatrixConstructCheckedINTEL* instructions. +
258+
+
259+
| *{main_capability_name}* +
260+
234261
|====
235262
--
236263

@@ -259,13 +286,11 @@ whose 'Type' operand is a scalar or vector type. If the *Shader* capability was
259286
declared, 'Pointer' must point into an array and any *ArrayStride* decoration on
260287
'Pointer' is ignored. +
261288
+
262-
'X offset' must be a constant instruction with scalar 32-bit integer type.
263-
It specifies offset in bytes along X axis from the 'Pointer' where prefetched
264-
memory region starts from. +
289+
'X offset' must have a scalar 32-bit integer type. It specifies the offset, in number of elements,
290+
along the X axis from 'Pointer' at which the prefetched memory region starts. +
265291
+
266-
'Y offset' must be a constant instruction with scalar 32-bit integer type.
267-
It specifies offset in bytes along Y axis from the 'Pointer' where prefetched
268-
memory region starts from. +
292+
'Y offset' must have a scalar 32-bit integer type. It specifies the offset, in number of elements,
293+
along the Y axis from 'Pointer' at which the prefetched memory region starts. +
269294
+
270295
'Rows' must be a constant instruction with scalar 32-bit integer type. +
271296
+
@@ -297,6 +322,169 @@ scalar 'integer type' and its exact semantics depend on 'MemoryLayout'. +
297322
'Stride' |
298323
|=====
299324

325+
[cols="1,1,10*3",width="100%"]
326+
|=====
327+
11+|[[OpCooperativeMatrixLoadCheckedINTEL]]*OpCooperativeMatrixLoadCheckedINTEL* +
328+
+
329+
Load a cooperative matrix through a pointer. The global matrix size might not be a multiple of the size of
330+
the two-dimensional region that is being loaded; in this case the out-of-bounds elements are
331+
set to 0. +
332+
+
333+
'Result Type' is the type of the loaded object. It must be a cooperative matrix
334+
type. +
335+
+
336+
'X offset' must have a scalar 32-bit integer type. It specifies the offset, in number of elements,
337+
along the X axis from 'Pointer' at which the loaded memory region starts. +
338+
+
339+
'Y offset' must have a scalar 32-bit integer type. It specifies the offset, in number of elements,
340+
along the Y axis from 'Pointer' at which the loaded memory region starts. +
341+
+
342+
'Pointer' is a pointer. Its type must be an *OpTypePointer* whose 'Type' operand
343+
is a scalar or vector type. If the *Shader* capability was declared, 'Pointer'
344+
must point into an array and any *ArrayStride* decoration on 'Pointer' is ignored. +
345+
+
346+
'MemoryLayout' specifies how matrix elements are laid out in memory. It must come
347+
from a 32-bit integer 'constant instruction' whose value corresponds to a
348+
'Cooperative Matrix Layout'. See the _Cooperative Matrix Layout_ table for
349+
a description of the layouts and detailed layout-specific rules. +
350+
+
351+
'Height' is the height (number of rows of a big matrix) of the two-dimensional
352+
region to load the matrix from. It must be a scalar 'integer type'. +
353+
+
354+
'Width' is the width (number of columns of a big matrix) of the two-dimensional
355+
region to load the matrix from. It must be a scalar 'integer type'. +
356+
+
357+
'Stride' further qualifies how matrix elements are laid out in memory. It must be a
358+
scalar 'integer type' and its exact semantics depend on 'MemoryLayout'. +
359+
+
360+
'Memory Operand' must be a +Memory Operand+ literal. If not present, it is the
361+
same as specifying *None*. +
362+
+
363+
For a given dynamic instance of this instruction, all operands of this
364+
instruction must be the same for all invocations in a given scope instance
365+
(where the scope is the scope the cooperative matrix type was created with).
366+
All invocations in a given scope instance must be active or all must be
367+
inactive. +
368+
+
369+
Note: To specify cache level for *OpCooperativeMatrixLoadCheckedINTEL* one
370+
can use *CacheControlLoadINTEL* decoration from {cache_control_url}[SPV_INTEL_cache_controls extension]. +
371+
+
372+
1+|Capability: +
373+
*{capability_checked_name}*
374+
1+| 9+variable | {OpCooperativeMatrixLoadCheckedINTEL_token} | '<id>' +
375+
'Result Type' |'Result <id>' | '<id>' +
376+
'Pointer' | '<id>' +
377+
'X offset' | '<id>' +
378+
'Y offset' | '<id>' +
379+
'MemoryLayout' | '<id>' +
380+
'Height' | '<id>' +
381+
'Width' | Optional '<id>' +
382+
'Stride' | Optional +
383+
'Memory Operand' |
384+
|=====
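The zero-fill behavior described for *OpCooperativeMatrixLoadCheckedINTEL* can be illustrated with a scalar reference model. This is only a sketch under stated assumptions, not the normative semantics: it assumes a row-major layout, assumes 'X offset' indexes columns and 'Y offset' indexes rows, and takes the cooperative matrix dimensions as explicit `rows` and `cols` parameters. The function name `load_checked` and its parameter names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of a checked load (assumes row-major layout,
// X = column direction, Y = row direction). Not part of the extension.
template <typename T>
std::vector<T> load_checked(const T* pointer, std::uint32_t x_offset,
                            std::uint32_t y_offset, std::uint32_t height,
                            std::uint32_t width, std::uint32_t stride,
                            std::uint32_t rows, std::uint32_t cols) {
  assert(stride >= width && "row-major stride must cover a full row");
  // Every element starts as zero; only in-bounds elements are overwritten.
  std::vector<T> tile(static_cast<std::size_t>(rows) * cols, T{});
  for (std::uint32_t r = 0; r < rows; ++r) {
    for (std::uint32_t c = 0; c < cols; ++c) {
      const std::uint64_t y = std::uint64_t{y_offset} + r;
      const std::uint64_t x = std::uint64_t{x_offset} + c;
      if (y < height && x < width) {
        tile[static_cast<std::size_t>(r) * cols + c] = pointer[y * stride + x];
      }
    }
  }
  return tile;
}
```

Under these assumptions, loading a 2x2 tile at offset (2, 2) from a 3x3 region reads one real element and leaves the other three at zero.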
385+
386+
[cols="1,1,9*3",width="100%"]
387+
|=====
388+
10+|[[OpCooperativeMatrixStoreCheckedINTEL]]*OpCooperativeMatrixStoreCheckedINTEL* +
389+
+
390+
Store a cooperative matrix through a pointer. The global matrix size might not be a multiple of the size of
391+
the region to which it is stored; in this case the out-of-bounds elements are
392+
dropped. +
393+
+
394+
'Pointer' is a pointer. Its type must be an *OpTypePointer* whose 'Type' operand
395+
is a scalar or vector type. If the *Shader* capability was declared, 'Pointer'
396+
must point into an array and any *ArrayStride* decoration on 'Pointer' is ignored. +
397+
+
398+
'X offset' must have a scalar 32-bit integer type. It specifies the offset, in number of elements,
399+
along the X axis from 'Pointer' at which the stored memory region starts. +
400+
+
401+
'Y offset' must have a scalar 32-bit integer type. It specifies the offset, in number of elements,
402+
along the Y axis from 'Pointer' at which the stored memory region starts. +
403+
+
404+
'Object' is the object to store. Its type must be a _cooperative matrix_. +
405+
+
406+
'MemoryLayout' specifies how matrix elements are laid out in memory. It must come
407+
from a 32-bit integer 'constant instruction' whose value corresponds to a
408+
'Cooperative Matrix Layout'. See the _Cooperative Matrix Layout_ table for
409+
a description of the layouts and detailed layout-specific rules. +
410+
+
411+
'Height' is the height (number of rows of a big matrix) of the two-dimensional
412+
region to store the matrix to. It must be a scalar 'integer type'. +
413+
+
414+
'Width' is the width (number of columns of a big matrix) of the two-dimensional
415+
region to store the matrix to. It must be a scalar 'integer type'. +
416+
+
417+
'Stride' further qualifies how matrix elements are laid out in memory. It must be a
418+
scalar 'integer type' and its exact semantics depend on 'MemoryLayout'. +
419+
+
420+
'Memory Operand' must be a +Memory Operand+ literal. If not present, it is the
421+
same as specifying *None*. +
422+
+
423+
For a given dynamic instance of this instruction, all operands of this
424+
instruction must be the same for all invocations in a given scope instance
425+
(where the scope is the scope the cooperative matrix type was created with).
426+
All invocations in a given scope instance must be active or all must be
427+
inactive. +
428+
+
429+
Note: To specify cache level for *OpCooperativeMatrixStoreCheckedINTEL* one
430+
can use *CacheControlStoreINTEL* decoration from {cache_control_url}[SPV_INTEL_cache_controls extension]. +
431+
+
432+
1+|Capability: +
433+
*{capability_checked_name}*
434+
1+| 8+variable | {OpCooperativeMatrixStoreCheckedINTEL_token} | '<id>' +
435+
'Pointer' | '<id>' +
436+
'X offset' | '<id>' +
437+
'Y offset' | '<id>' +
438+
'Object' | '<id>' +
439+
'MemoryLayout' | '<id>' +
440+
'Height' | '<id>' +
441+
'Width' | Optional '<id>' +
442+
'Stride' | Optional +
443+
'Memory Operand' |
444+
|=====
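The "out-of-bounds elements are dropped" rule for *OpCooperativeMatrixStoreCheckedINTEL* can be modeled in the same scalar way. Again this is a sketch under assumptions rather than the normative behavior: row-major layout, 'X offset' along columns, 'Y offset' along rows, and the cooperative matrix passed as a flat row-major vector. The name `store_checked` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of a checked store (assumes row-major layout,
// X = column direction, Y = row direction). Not part of the extension.
template <typename T>
void store_checked(T* pointer, std::uint32_t x_offset, std::uint32_t y_offset,
                   const std::vector<T>& object, std::uint32_t height,
                   std::uint32_t width, std::uint32_t stride,
                   std::uint32_t rows, std::uint32_t cols) {
  assert(object.size() == static_cast<std::size_t>(rows) * cols);
  assert(stride >= width && "row-major stride must cover a full row");
  for (std::uint32_t r = 0; r < rows; ++r) {
    for (std::uint32_t c = 0; c < cols; ++c) {
      const std::uint64_t y = std::uint64_t{y_offset} + r;
      const std::uint64_t x = std::uint64_t{x_offset} + c;
      // Elements that fall outside the Height x Width region are skipped.
      if (y < height && x < width) {
        pointer[y * stride + x] = object[static_cast<std::size_t>(r) * cols + c];
      }
    }
  }
}
```

Because out-of-range elements are skipped instead of clamped, memory outside the 'Height' by 'Width' region is never written in this model.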
445+
446+
[cols="1,1,7*3",width="100%"]
447+
|=====
448+
8+|[[OpCooperativeMatrixConstructCheckedINTEL]]*OpCooperativeMatrixConstructCheckedINTEL* +
449+
+
450+
Construct a new _cooperative matrix_. It assigns 'Value' to elements in a range from
451+
'X offset' to 'Height' and 'Y offset' to 'Width', setting the remaining elements to zero. +
452+
+
453+
'Result Type' is the type of the constructed object. It must be a cooperative matrix
454+
type. +
455+
+
456+
'X offset' must have a scalar 32-bit integer type. It specifies the offset, in number of elements,
457+
along the X axis of the initialized two-dimensional region. +
458+
+
459+
'Y offset' must have a scalar 32-bit integer type. It specifies the offset, in number of elements,
460+
along the Y axis of the initialized two-dimensional region. +
461+
+
462+
'Height' is the height (number of rows of a big matrix) of the initialized two-dimensional region.
463+
It must be a scalar 'integer type'. +
464+
+
465+
'Width' is the width (number of columns of a big matrix) of the initialized two-dimensional region.
466+
It must be a scalar 'integer type'. +
467+
+
468+
'Value' is an initializer value for the constructed object. It must have the same type
469+
as an element type of the 'Result Type'. +
470+
+
471+
For a given dynamic instance of this instruction, all operands of this
472+
instruction must be the same for all invocations in a given scope instance
473+
(where the scope is the scope the cooperative matrix type was created with).
474+
All invocations in a given scope instance must be active or all must be
475+
inactive. +
476+
+
477+
1+|Capability: +
478+
*{capability_checked_name}*
479+
1+| 7 | {OpCooperativeMatrixConstructCheckedINTEL_token} | '<id>' +
480+
'Result Type' |'Result <id>' | '<id>' +
481+
'X offset' | '<id>' +
482+
'Y offset' | '<id>' +
483+
'Height' | '<id>' +
484+
'Width' | '<id>' +
485+
'Value' |
486+
|=====
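One possible reading of *OpCooperativeMatrixConstructCheckedINTEL* is sketched below. It is an interpretation, not the normative definition: it assumes that element (r, c) of the result corresponds to position ('Y offset' + r, 'X offset' + c) of the large matrix, and that the element receives 'Value' only when that position lies inside the 'Height' by 'Width' region. The name `construct_checked`, its parameters, and the flat row-major result are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of a checked construct (assumes X = column
// direction, Y = row direction). Not part of the extension.
template <typename T>
std::vector<T> construct_checked(std::uint32_t x_offset, std::uint32_t y_offset,
                                 std::uint32_t height, std::uint32_t width,
                                 T value, std::uint32_t rows,
                                 std::uint32_t cols) {
  assert(rows > 0 && cols > 0 && "matrix dimensions must be non-zero");
  // Elements outside the Height x Width region stay zero.
  std::vector<T> tile(static_cast<std::size_t>(rows) * cols, T{});
  for (std::uint32_t r = 0; r < rows; ++r) {
    for (std::uint32_t c = 0; c < cols; ++c) {
      const std::uint64_t y = std::uint64_t{y_offset} + r;
      const std::uint64_t x = std::uint64_t{x_offset} + c;
      if (y < height && x < width) {
        tile[static_cast<std::size_t>(r) * cols + c] = value;
      }
    }
  }
  return tile;
}
```

With this reading, a matrix constructed this way matches what the checked-load sketch would return from a region filled with 'Value', which keeps partial tiles at the matrix edge consistent between the two instructions.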
487+
300488
==== 3.42.11. Conversion Instructions
301489

302490
If *{bf16_capability_name}* and *BFloat16ConversionINTEL* capabilities are
@@ -324,8 +512,8 @@ Returns (Row, Column) coordinate of dynamically selected element of a matrix. +
324512
contains the row with the selected element, and the second element contains the
325513
column with the selected element. +
326514
+
327-
'Matrix' is an ID of *OpTypeCooperativeMatrixKHR*. The instruction returns the
328-
element's coordinate of this cooperative matrix type. +
515+
'Matrix' is a _cooperative matrix_. The instruction returns the
516+
element's coordinate of the _cooperative matrix_. +
329517
+
330518
'Index' must be a 32-bit 'scalar integer'. It is interpreted as an index into the list
331519
of components owned by this work-item in the cooperative matrix. The behavior is
@@ -342,53 +530,43 @@ that *OpCooperativeMatrixLengthKHR* returns for this work-item. +
342530
| '<id>' +
343531
'Matrix'
344532
| '<id>' +
345-
'Index'
533+
'Index' |
346534
|=====
347535

348-
[cols="1,1,5*3",width="100%"]
536+
[cols="1,1,4*3",width="100%"]
349537
|=====
350-
6+|[[OpCooperativeMatrixApplyFunctionINTEL]]*OpCooperativeMatrixApplyFunctionINTEL* +
538+
5+|[[OpCooperativeMatrixApplyFunctionINTEL]]*OpCooperativeMatrixApplyFunctionINTEL* +
539+
+
540+
*NOTE* the instruction is experimental. +
351541
+
352-
Apply the function for each element of the matrix. Results in a new matrix within
542+
Apply the function object to each element of the matrix. Results in a new matrix within
353543
the same scope and with the same number of rows and columns. +
354544
+
355545
'Result Type' is the type of the return value of the function. It must be an
356-
*OpTypeCooperativeMatrix* with the same _Scope_, _Rows_ and _Columns_ as the type of
546+
*OpTypeCooperativeMatrixKHR* with the same _Scope_, _Rows_ and _Columns_ as the type of
357547
'Matrix' operand. _Component type_ as well as _Use_ of 'Result Type' and 'Matrix' can
358548
differ. +
359549
+
360-
'Function' is an *OpFunction* instruction whose *OpTypeFunction* operand has _Result Type_
361-
of scalar _numerical type_. This could be a forward reference. The 'Function' will be
362-
invoked (_Rows_ - 'Y')_x_(_Cols_ - 'X') times within the cooperative matrix scope. The first parameter of the
363-
'Function' must be scalar _numerical type_ that corresponds to an element of
364-
the matrix to which 'Function' is being applied.
550+
'Function object' must be an *OpTypePointer* with *OpTypeStruct* _Type_.
551+
The 'Function object' will be invoked within the cooperative matrix scope.
365552
+
366553
'Matrix' is a cooperative matrix which elements are used as the first parameter of
367554
the 'Function'. +
368555
+
369-
'Argument N' is the object to copy to parameter N. +
370-
+
371-
*Note* the first parameter is omitted in this list of parameters, as it is copied
372-
from the unique element of the 'Matrix'. Following two parameters must be (X, Y)
373-
coordinate of a first element of the matrix to apply the function, for example
374-
(0, 0) would mean, that *OpCooperativeMatrixApplyFunctionINTEL* affects the
375-
entire matrix. +
376-
+
377556

378557
1+|Capability: +
379558
*{invocation_capability_name}*
380-
1+| 4 + variable | {OpCooperativeMatrixApplyFunctionINTEL_token}
559+
1+| 4 | {OpCooperativeMatrixApplyFunctionINTEL_token}
381560
| '<id>' +
382561
'Result Type'
383562
| 'Result <id>'
384563
| '<id>' +
385-
'Function'
564+
'Function object'
386565
| '<id>' +
387566
'Matrix'
388-
| '<id>, <id>, ..., <id>' +
389-
'Argument 1', 'Argument 2', ..., 'Argument N'
390567
|=====
391568

569+
392570
=== Issues
393571

394572
1. Should we keep *OpCooperativeMatrixGetElementCoordINTEL* once we have *OpCooperativeMatrixApplyFunctionINTEL*? +
@@ -419,4 +597,5 @@ Revision History
419597
|13|2023-09-25|Dmitry Sidorov|Add conversion instructions for tf32 and bf16
420598
|14|2023-10-11|Dmitry Sidorov|Add matrix prefetch instruction
421599
|15|2023-11-06|Dmitry Sidorov|Put deprecation note on OpCooperativeMatrixGetElementCoordINTEL
600+
|16|2023-11-06|Dmitry Sidorov|Add checked load, store and construct instructions
422601
|========================================
