@@ -34,6 +34,8 @@ Explicit SIMD APIs can be used only in code to be executed on Intel Gen
architecture devices and the host device for now. Attempt to run such code on
other devices will result in error.

+ All the ESIMD APIs are defined in the `sycl::ext::intel::experimental::esimd` namespace.
+
Kernels and `SYCL_EXTERNAL` functions using ESP must be explicitly marked with
the `[[intel::sycl_explicit_simd]]` attribute. Subgroup size query within such
functions will always return `1`.
@@ -56,17 +58,18 @@ private:

*Lambda kernel and function*
```cpp
+ using namespace sycl::ext::intel::experimental::esimd;
SYCL_EXTERNAL
- void sycl_device_f(sycl::global_ptr<int> ptr, sycl::intel::gpu::simd<float, 8> X) [[intel::sycl_explicit_simd]] {
- sycl::intel::gpu::flat_block_write(*ptr.get(), X);
+ void sycl_device_f(sycl::global_ptr<int> ptr, simd<float, 8> X) [[intel::sycl_explicit_simd]] {
+ flat_block_write(*ptr.get(), X);
}
...
Q.submit([&](sycl::handler &Cgh) {
auto Acc1 = Buf1.get_access<sycl::access::mode::read>(Cgh);
auto Acc2 = Buf2.get_access<sycl::access::mode::read_write>(Cgh);

Cgh.single_task<class KernelID>([=] () [[intel::sycl_explicit_simd]] {
- sycl::intel::gpu::simd<float, 8> Val = sycl::intel::gpu::flat_block_read(Acc1.get_pointer());
+ simd<float, 8> Val = flat_block_read(Acc1.get_pointer());
sycl_device_f(Acc2, Val);
});
});
@@ -87,7 +90,7 @@ device-side API
- 2D and 3D accessors
- Constant accessors
- `sycl::accessor::get_pointer()`. All memory accesses through an accessor are
- done via explicit APIs; e.g. `sycl::intel::gpu::block_store(acc, offset)`
+ done via explicit APIs; e.g. `sycl::ext::intel::experimental::esimd::block_store(acc, offset)`
- Few others (to be documented)

## Core Explicit SIMD programming APIs
@@ -98,15 +101,15 @@ efficient mapping to SIMD vector operations on Intel GPU architectures.

### SIMD vector class

- The `sycl::intel::gpu::simd` class is a vector templated on some element type.
+ The `simd` class is a vector templated on some element type.
The element type must be vectorizable type. The set of vectorizable types is the
set of fundamental SYCL arithmetic types (C++ arithmetic types or `half` type)
excluding `bool`. The length of the vector is the second template parameter.

ESIMD compiler back-end does the best it can to map each `simd` class object to a consecutive block
of registers in the general register file (GRF).

- Every specialization of ```sycl::intel::gpu::simd``` shall be a complete type. The term
+ Every specialization of `simd` class shall be a complete type. The term
simd type refers to all supported specialization of the simd class template.
To access the i-th individual data element in a simd vector, Explicit SIMD supports the
standard subscript operator ```[]```, which returns by value.
@@ -215,7 +218,7 @@ To model predicated move, Explicit SIMD provides the following merge functions:
```
### `simd_view` class

- The ```sycl::intel::gpu::simd_view``` represents a "window" into existing simd object,
+ The `simd_view` represents a "window" into existing simd object,
through which a part of the original object can be read or modified. This is a
syntactic convenience feature to reduce verbosity when accessing sub-regions of
simd objects. **RegionTy** describes the window shape and can be 1D or 2D,
@@ -231,8 +234,8 @@ different shapes and dimensions as illustrated below (`auto` resolves to a
<img src="images/simd_view.svg" title="1D select example" width="800" height="300"/>
</p>

- ```sycl::intel::gpu::simd_view``` class supports all the element-wise operations and
- other utility functions defined for ```sycl::intel::gpu::simd``` class. It also
+ `simd_view` class supports all the element-wise operations and
+ other utility functions defined for `simd` class. It also
provides region accessors and more generic operations tailored for 2D regions,
such as row/column operators and 2D select/replicate/format/merge operations.

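For readers of this diff who want a concrete picture of the two classes being renamed, the following is a minimal plain C++ sketch of the semantics the surrounding text describes: a subscript operator that returns by value, and a view that acts as a window through which part of the original object can be read or modified. The names `toy_simd` and `toy_view`, and the `set`, `read` and `write` members, are hypothetical and exist only for illustration; this is not the ESIMD implementation or its API, and it omits `half`, 2D regions and the element-wise operators.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for `simd`: a fixed-length vector whose subscript
// operator returns a copy of the element, never a reference.
template <typename T, std::size_t N>
class toy_simd {
  std::array<T, N> data_{};  // value-initialized, so elements start at zero

public:
  // Returns the i-th element by value.
  T operator[](std::size_t i) const {
    assert(i < N);
    return data_[i];
  }

  // Explicit setter (hypothetical name) used for writes in this model.
  void set(std::size_t i, T value) {
    assert(i < N);
    data_[i] = value;
  }
};

// Hypothetical stand-in for a 1D `simd_view`: it stores no elements of its
// own, only a reference to the viewed vector plus offset, stride and size.
template <typename T, std::size_t N>
class toy_view {
  toy_simd<T, N> &base_;
  std::size_t offset_, stride_, size_;

public:
  toy_view(toy_simd<T, N> &base, std::size_t offset, std::size_t stride,
           std::size_t size)
      : base_(base), offset_(offset), stride_(stride), size_(size) {
    // The last selected element must still lie inside the viewed vector.
    assert(size == 0 || offset + (size - 1) * stride < N);
  }

  // Reads the i-th element of the window.
  T read(std::size_t i) const {
    assert(i < size_);
    return base_[offset_ + i * stride_];
  }

  // Writing through the view modifies the viewed vector.
  void write(std::size_t i, T value) {
    assert(i < size_);
    base_.set(offset_ + i * stride_, value);
  }
};
```

In this model, a view built with offset 1, stride 2 and size 4 over an 8-element vector selects elements 1, 3, 5 and 7, and a `write` through it changes only those positions of the underlying vector.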
@@ -479,12 +482,12 @@ int main(void) {
auto e = q.submit([&](handler &cgh) {
cgh.parallel_for<class Test>(
Range, [=](nd_item<1> i) [[intel::sycl_explicit_simd]] {
-
+ using namespace sycl::ext::intel::experimental::esimd;
auto offset = i.get_global_id(0) * VL;
- sycl::intel::gpu<float, VL> va = sycl::intel::gpu::flat_block_load<float, VL>(A + offset);
- sycl::intel::gpu<float, VL> vb = sycl::intel::gpu::flat_block_load<float, VL>(B + offset);
- sycl::intel::gpu<float, VL> vc = va + vb;
- sycl::intel::gpu::flat_block_store<float, VL>(C + offset, vc);
+ simd<float, VL> va = flat_block_load<float, VL>(A + offset);
+ simd<float, VL> vb = flat_block_load<float, VL>(B + offset);
+ simd<float, VL> vc = va + vb;
+ flat_block_store<float, VL>(C + offset, vc);
});
});
e.wait();
@@ -501,9 +504,9 @@ int main(void) {

- Design interoperability with SPMD context - e.g. invocation of ESIMD functions
from a standard SYCL code
- - Generate sycl::intel::gpu API documentation from sources
+ - Generate `sycl::ext::intel::experimental::esimd` API documentation from sources
- Section covering 2D use cases
- - A bridge from `std::simd` to `sycl::intel::gpu::simd`
+ - A bridge from `std::simd` to `sycl::ext::intel::experimental::esimd::simd`
- Describe `simd_view` class restrictions
- Support OpenCL and L0 interop for ESIMD kernels
- Consider auto-inclusion of sycl_explicit_simd.hpp under -fsycl-explicit-simd option