Commit 1e0bd1e

[ESIMD] Update documentation to reflect recent ESIMD API changes. (#3727)
* [ESIMD] Updated documentation to reflect recent changes.
1 parent 47aeafa commit 1e0bd1e

2 files changed: +26 -21 lines changed

sycl/doc/extensions/ExplicitSIMD/README.md

Lines changed: 7 additions & 5 deletions
@@ -36,21 +36,23 @@ third:
3636

3737
auto e = q.submit([&](handler &cgh) {
3838
cgh.parallel_for<class Test>(Range, [=](nd_item<1> ndi) SYCL_ESIMD_KERNEL {
39-
using namespace sycl::INTEL::gpu;
39+
using namespace sycl::ext::intel::experimental::esimd;
4040

4141
int i = ndi.get_global_id(0);
42-
simd<float, VL> va = block_load<float, VL>(A + i * VL);
43-
simd<float, VL> vb = block_load<float, VL>(B + i * VL);
42+
simd<float, VL> va;
43+
va.copy_from(A + i * VL);
44+
simd<float, VL> vb;
45+
vb.copy_from(B + i * VL);
4446
simd<float, VL> vc = va + vb;
45-
block_store<float, VL>(C + i * VL, vc);
47+
vc.copy_to(C + i * VL);
4648
});
4749
});
4850
```
4951
5052
In this example the lambda function passed to the `parallel_for` is marked with
5153
a special attribute - `SYCL_ESIMD_KERNEL`. This tells the compiler that this
5254
kernel is an ESIMD one and ESIMD APIs can be used inside it. Here the `simd`
53-
objects and `block_load`/`block_store` intrinsics are used which are available
55+
objects and `copy_from`/`copy_to` intrinsics are used which are available
5456
only in the ESIMD extension.
5557
Full runnable code sample can be found on the
5658
[github repo](https://github.com/intel/llvm-test-suite/blob/intel/SYCL/ESIMD/vadd_usm.cpp).

sycl/doc/extensions/ExplicitSIMD/dpcpp-explicit-simd.md

Lines changed: 19 additions & 16 deletions
@@ -34,6 +34,8 @@ Explicit SIMD APIs can be used only in code to be executed on Intel Gen
3434
architecture devices and the host device for now. Attempt to run such code on
3535
other devices will result in error.
3636

37+
All the ESIMD APIs are defined in the `sycl::ext::intel::experimental::esimd` namespace.
38+
3739
Kernels and `SYCL_EXTERNAL` functions using ESP must be explicitly marked with
3840
the `[[intel::sycl_explicit_simd]]` attribute. Subgroup size query within such
3941
functions will always return `1`.
@@ -56,17 +58,18 @@ private:
5658
5759
*Lambda kernel and function*
5860
```cpp
61+
using namespace sycl::ext::intel::experimental::esimd;
5962
SYCL_EXTERNAL
60-
void sycl_device_f(sycl::global_ptr<int> ptr, sycl::intel::gpu::simd<float, 8> X) [[intel::sycl_explicit_simd]] {
61-
sycl::intel::gpu::flat_block_write(*ptr.get(), X);
63+
void sycl_device_f(sycl::global_ptr<int> ptr, simd<float, 8> X) [[intel::sycl_explicit_simd]] {
64+
flat_block_write(*ptr.get(), X);
6265
}
6366
...
6467
Q.submit([&](sycl::handler &Cgh) {
6568
auto Acc1 = Buf1.get_access<sycl::access::mode::read>(Cgh);
6669
auto Acc2 = Buf2.get_access<sycl::access::mode::read_write>(Cgh);
6770
6871
Cgh.single_task<class KernelID>([=] () [[intel::sycl_explicit_simd]] {
69-
sycl::intel::gpu::simd<float, 8> Val = sycl::intel::gpu::flat_block_read(Acc1.get_pointer());
72+
simd<float, 8> Val = flat_block_read(Acc1.get_pointer());
7073
sycl_device_f(Acc2, Val);
7174
});
7275
});
@@ -87,7 +90,7 @@ device-side API
8790
- 2D and 3D accessors
8891
- Constant accessors
8992
- `sycl::accessor::get_pointer()`. All memory accesses through an accessor are
90-
done via explicit APIs; e.g. `sycl::intel::gpu::block_store(acc, offset)`
93+
done via explicit APIs; e.g. `sycl::ext::intel::experimental::esimd::block_store(acc, offset)`
9194
- Few others (to be documented)
9295

9396
## Core Explicit SIMD programming APIs
@@ -98,15 +101,15 @@ efficient mapping to SIMD vector operations on Intel GPU architectures.
98101

99102
### SIMD vector class
100103

101-
The `sycl::intel::gpu::simd` class is a vector templated on some element type.
104+
The `simd` class is a vector templated on some element type.
102105
The element type must be vectorizable type. The set of vectorizable types is the
103106
set of fundamental SYCL arithmetic types (C++ arithmetic types or `half` type)
104107
excluding `bool`. The length of the vector is the second template parameter.
105108

106109
ESIMD compiler back-end does the best it can to map each `simd` class object to a consecutive block
107110
of registers in the general register file (GRF).
108111

109-
Every specialization of ```sycl::intel::gpu::simd``` shall be a complete type. The term
112+
Every specialization of `simd` class shall be a complete type. The term
110113
simd type refers to all supported specialization of the simd class template.
111114
To access the i-th individual data element in a simd vector, Explicit SIMD supports the
112115
standard subscript operator ```[]```, which returns by value.
@@ -215,7 +218,7 @@ To model predicated move, Explicit SIMD provides the following merge functions:
215218
```
216219
### `simd_view` class
217220

218-
The ```sycl::intel::gpu::simd_view``` represents a "window" into existing simd object,
221+
The `simd_view` represents a "window" into existing simd object,
219222
through which a part of the original object can be read or modified. This is a
220223
syntactic convenience feature to reduce verbosity when accessing sub-regions of
221224
simd objects. **RegionTy** describes the window shape and can be 1D or 2D,
@@ -231,8 +234,8 @@ different shapes and dimensions as illustrated below (`auto` resolves to a
231234
<img src="images/simd_view.svg" title="1D select example" width="800" height="300"/>
232235
</p>
233236

234-
```sycl::intel::gpu::simd_view``` class supports all the element-wise operations and
235-
other utility functions defined for ```sycl::intel::gpu::simd``` class. It also
237+
`simd_view` class supports all the element-wise operations and
238+
other utility functions defined for `simd` class. It also
236239
provides region accessors and more generic operations tailored for 2D regions,
237240
such as row/column operators and 2D select/replicate/format/merge operations.
238241

@@ -479,12 +482,12 @@ int main(void) {
479482
auto e = q.submit([&](handler &cgh) {
480483
cgh.parallel_for<class Test>(
481484
Range, [=](nd_item<1> i) [[intel::sycl_explicit_simd]] {
482-
485+
using namespace sycl::ext::intel::experimental::esimd;
483486
auto offset = i.get_global_id(0) * VL;
484-
sycl::intel::gpu<float, VL> va = sycl::intel::gpu::flat_block_load<float, VL>(A + offset);
485-
sycl::intel::gpu<float, VL> vb = sycl::intel::gpu::flat_block_load<float, VL>(B + offset);
486-
sycl::intel::gpu<float, VL> vc = va + vb;
487-
sycl::intel::gpu::flat_block_store<float, VL>(C + offset, vc);
487+
simd<float, VL> va = flat_block_load<float, VL>(A + offset);
488+
simd<float, VL> vb = flat_block_load<float, VL>(B + offset);
489+
simd<float, VL> vc = va + vb;
490+
flat_block_store<float, VL>(C + offset, vc);
488491
});
489492
});
490493
e.wait();
@@ -501,9 +504,9 @@ int main(void) {
501504

502505
- Design interoperability with SPMD context - e.g. invocation of ESIMD functions
503506
from a standard SYCL code
504-
- Generate sycl::intel::gpu API documentation from sources
507+
- Generate `sycl::ext::intel::experimental::esimd` API documentation from sources
505508
- Section covering 2D use cases
506-
- A bridge from `std::simd` to `sycl::intel::gpu::simd`
509+
- A bridge from `std::simd` to `sycl::ext::intel::experimental::esimd::simd`
507510
- Describe `simd_view` class restrictions
508511
- Support OpenCL and L0 interop for ESIMD kernels
509512
- Consider auto-inclusion of sycl_explicit_simd.hpp under -fsycl-explicit-simd option
