Commit 98715ae

[SYCL][Doc] Use marray for sub-group loads/stores (#2167)

vec is only compatible with built-in scalar data types. Using marray instead will allow sub-group loads/stores to eventually support user-defined types and other C++ types like std::complex.

Using marray is also clearer, since the use of vec alongside sub-groups (which may represent a SIMD vector in some implementations) has confused some users.

Signed-off-by: John Pennycook <[email protected]>
1 parent be7c1cb commit 98715ae

1 file changed

+17
-10
lines changed

sycl/doc/extensions/SubGroupAlgorithms/SYCL_INTEL_sub_group_algorithms.asciidoc

Lines changed: 17 additions & 10 deletions
@@ -31,7 +31,7 @@ This document describes an extension which introduces a library of sub-group fun
 
 == Notice
 
-Copyright (c) 2020 Intel Corporation. All rights reserved.
+Copyright (c) 2020-2021 Intel Corporation. All rights reserved.
 
 == Status
 
@@ -55,6 +55,7 @@ This extension is written against the SYCL 1.2.1 specification, Revision 6 and t
 
 - +SYCL_INTEL_group_algorithms+
 - +SYCL_INTEL_sub_group+
+- +SYCL_INTEL_math_array+
 
 == Overview
 
@@ -82,9 +83,9 @@ It additionally introduces a number of functions that are currently specific to
 
 === Data Types
 
-All functions are supported for the fundamental scalar types supported by SYCL and instances of the SYCL +vec+ class. The fundamental scalar types (as defined in Section 6.5 of the SYCL 1.2.1 specification) are: +bool+, +char+, +signed char+, +unsigned char+, +short int+, +unsigned short int+, +int+, +unsigned int+, +long int+, +unsigned long int+, +long long int+, +unsigned long long int+, +size_t+, +float+, +double+, +half+.
+All functions are supported for the fundamental scalar types supported by SYCL and instances of the SYCL +vec+ and +marray+ classes. The fundamental scalar types (as defined in Section 6.5 of the SYCL 1.2.1 specification) are: +bool+, +char+, +signed char+, +unsigned char+, +short int+, +unsigned short int+, +int+, +unsigned int+, +long int+, +unsigned long int+, +long long int+, +unsigned long long int+, +size_t+, +float+, +double+, +half+.
 
-Functions with arguments of type +vec<T,N>+ are applied component-wise: they are semantically equivalent to N calls to a scalar function of type +T+.
+Functions with arguments of type +vec<T,N>+ or +marray<T,N>+ are applied component-wise: they are semantically equivalent to N calls to a scalar function of type +T+.
 
 === Functions
 
@@ -135,22 +136,27 @@ The load and store sub-group functions enable developers to assert that all work
 |Function|Description
 
 |+template <typename T> T load(sub_group sg, const T *src)+
-|Load contiguous data from _src_. Returns one element per work-item, corresponding to the memory location at _src_ + +get_local_id()+. The value of _src_ must be the same for all work-items in the sub-group. The address space information is deduced automatically. Only pointers to global and local address spaces are valid. Passing a pointer to other address spaces will cause the run time assertion.
+|Load contiguous data from _src_. Returns one element per work-item, corresponding to the memory location at _src_ + +get_local_id()+. The value of _src_ must be the same for all work-items in the sub-group. The address space information is deduced automatically. Only pointers to global and local address spaces are valid. Passing a pointer to other address spaces will cause the run time assertion. +T+ must be a _NumericType_.
 
 |+template <typename T, access::address_space Space> T load(sub_group sg, const multi_ptr<T,Space> src)+
-|Load contiguous data from _src_. Returns one element per work-item, corresponding to the memory location at _src_ + +get_local_id()+. The value of _src_ must be the same for all work-items in the sub-group. _Space_ must be +access::address_space::global_space+ or +access::address_space::local_space+.
+|Load contiguous data from _src_. Returns one element per work-item, corresponding to the memory location at _src_ + +get_local_id()+. The value of _src_ must be the same for all work-items in the sub-group. _Space_ must be +access::address_space::global_space+ or +access::address_space::local_space+. +T+ must be a _NumericType_.
 
-|+template <int N, typename T, access::address_space Space> vec<T,N> load(sub_group sg, const multi_ptr<T,Space> src)+
-|Load contiguous data from _src_. Returns _N_ elements per work-item, corresponding to the _N_ memory locations at _src_ + +i+ * +get_max_local_range()+ + +get_local_id()+ for +i+ between 0 and _N_. The value of _src_ must be the same for all work-items in the sub-group. _Space_ must be +access::address_space::global_space+ or +access::address_space::local_space+.
+|+template <int N, typename T, access::address_space Space> __unspecified__ load(sub_group sg, const multi_ptr<T,Space> src)+
+|Load contiguous data from _src_. Returns _N_ elements per work-item, corresponding to the _N_ memory locations at _src_ + +i+ * +get_max_local_range()+ + +get_local_id()+ for +i+ between 0 and _N_. The return type is implicitly convertible to +vec<T,N>+ (if +T+ is compatible with the +vec+ interface) and to +marray<T,N>+. The value of _src_ must be the same for all work-items in the sub-group. _Space_ must be +access::address_space::global_space+ or +access::address_space::local_space+. +T+ must be a _NumericType_.
 
 |+template <typename T> void store(sub_group sg, T *dst, const T& x)+
-|Store contiguous data to _dst_. The value of _x_ from each work-item is written to the memory location at _dst_ + +get_local_id()+. The value of _dst_ must be the same for all work-items in the sub-group. The address space information is deduced automatically. Only pointers to global and local address spaces are valid. Passing a pointer to other address spaces will cause the run time assertion.
+|Store contiguous data to _dst_. The value of _x_ from each work-item is written to the memory location at _dst_ + +get_local_id()+. The value of _dst_ must be the same for all work-items in the sub-group. The address space information is deduced automatically. Only pointers to global and local address spaces are valid. Passing a pointer to other address spaces will cause the run time assertion. +T+ must be a _NumericType_.
 
 |+template <typename T, access::address_space Space> void store(sub_group sg, multi_ptr<T,Space> dst, const T& x)+
-|Store contiguous data to _dst_. The value of _x_ from each work-item is written to the memory location at _dst_ + +get_local_id()+. The value of _dst_ must be the same for all work-items in the sub-group. _Space_ must be +access::address_space::global_space+ or +access::address_space::local_space+.
+|Store contiguous data to _dst_. The value of _x_ from each work-item is written to the memory location at _dst_ + +get_local_id()+. The value of _dst_ must be the same for all work-items in the sub-group. _Space_ must be +access::address_space::global_space+ or +access::address_space::local_space+. +T+ must be a _NumericType_.
 
 |+template <int N, typename T, access::address_space Space> void store(sub_group sg, multi_ptr<T,Space> dst, const vec<T,N>& x)+
-|Store contiguous data to _dst_. The _N_ elements from each work-item are written to the memory locations at _dst_ + +i+ * +get_max_local_range()+ + +get_local_id()+ for +i+ between 0 and _N_. The value of _dst_ must be the same for all work-items in the sub-group. _Space_ must be +access::address_space::global_space+ or +access::address_space::local_space+.
+|Store contiguous data to _dst_. The _N_ elements from each work-item are written to the memory locations at _dst_ + +i+ * +get_max_local_range()+ + +get_local_id()+ for +i+ between 0 and _N_. The value of _dst_ must be the same for all work-items in the sub-group. _Space_ must be +access::address_space::global_space+ or +access::address_space::local_space+. +T+ must be a _NumericType_.
+
+|+template <int N, typename T, access::address_space Space> void store(sub_group sg, multi_ptr<T,Space> dst, const marray<T,N>& x)+
+|Store contiguous data to _dst_. The _N_ elements from each work-item are written to the memory locations at _dst_ + +i+ * +get_max_local_range()+ + +get_local_id()+ for +i+ between 0 and _N_. The value of _dst_ must be the same for all work-items in the sub-group. _Space_ must be +access::address_space::global_space+ or +access::address_space::local_space+. +T+ must be a _NumericType_.
+|===
+
 |===
 
 == Issues
@@ -172,6 +178,7 @@ None.
 |Rev|Date|Author|Changes
 |1|2020-03-16|John Pennycook|*Initial public working draft*
 |2|2021-02-26|Vladimir Lazarev|*Add load/store method for raw pointers*
+|3|2021-04-06|John Pennycook|*Use sycl::marray in place of sycl::vec for load/store*
 |========================================
 
 //************************************************************************
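
The blocked addressing described for the N-element load and store in this diff (base pointer + i * get_max_local_range() + get_local_id()) can be modeled on the host without a SYCL runtime. What follows is a minimal sketch under stated assumptions, not the extension's implementation: emulate_sub_group_load and emulate_sub_group_store are hypothetical helper names, std::array stands in for marray<T,N>, and "for i between 0 and N" is assumed to mean 0 <= i < N.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical host-side model (not part of SYCL or this extension).
// Returns the N elements that the work-item with index `local_id` would
// receive from an N-element sub-group load, assuming the i-th element is
// read from src + i * max_local_range + local_id for 0 <= i < N.
template <std::size_t N, typename T>
std::array<T, N> emulate_sub_group_load(const T* src,
                                        std::size_t local_id,
                                        std::size_t max_local_range) {
    assert(local_id < max_local_range);  // a work-item index lies inside the sub-group
    std::array<T, N> result{};
    for (std::size_t i = 0; i < N; ++i) {
        result[i] = src[i * max_local_range + local_id];
    }
    return result;
}

// Hypothetical mirror of the N-element store: element i of `x` is written
// to dst + i * max_local_range + local_id for 0 <= i < N.
template <std::size_t N, typename T>
void emulate_sub_group_store(T* dst,
                             std::size_t local_id,
                             std::size_t max_local_range,
                             const std::array<T, N>& x) {
    assert(local_id < max_local_range);
    for (std::size_t i = 0; i < N; ++i) {
        dst[i * max_local_range + local_id] = x[i];
    }
}
```

With a maximum local range of 4 and N = 2, the model has work-item 1 touch offsets 1 and 5 from the base pointer, so neighbouring work-items access neighbouring addresses within each of the N blocks.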