Skip to content

Commit 1a283ac

Browse files
cbauingegmlueck
andauthored
[SYCL] Add proposal for append_and_shift extension (#8902)
Proposal extends the existing shift_group_left and shift_group_right functions to first append and then shift such that all items in a sub group can have well-defined values after the shift. --------- Co-authored-by: Greg Lueck <[email protected]>
1 parent 10d2f5b commit 1a283ac

File tree

1 file changed

+192
-0
lines changed

1 file changed

+192
-0
lines changed
Lines changed: 192 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,192 @@
1+
= sycl_ext_oneapi_append_and_shift
2+
3+
:source-highlighter: coderay
4+
:coderay-linenums-mode: table
5+
6+
// This section needs to be after the document title.
7+
:doctype: book
8+
:toc2:
9+
:toc: left
10+
:encoding: utf-8
11+
:lang: en
12+
:dpcpp: pass:[DPC++]
13+
14+
// Set the default source code type in this document to C++,
15+
// for syntax highlighting purposes. This is needed because
16+
// docbook uses c++ and html5 uses cpp.
17+
:language: {basebackend@docbook:c++:cpp}
18+
19+
20+
== Notice
21+
22+
[%hardbreaks]
23+
Copyright (C) 2023-2023 Intel Corporation. All rights reserved.
24+
25+
Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
26+
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
27+
permission by Khronos.
28+
29+
30+
== Contact
31+
32+
To report problems with this extension, please open a new issue at:
33+
34+
https://github.com/intel/llvm/issues
35+
36+
37+
== Dependencies
38+
39+
This extension is written against the SYCL 2020 revision 6 specification. All references below to the "core SYCL specification" or to section numbers in the SYCL specification refer to that revision.
40+
41+
42+
== Status
43+
44+
This is a proposed extension specification, intended to gather community feedback. Interfaces defined in this specification may not be implemented yet or may be in a preliminary state. The specification itself may also change in incompatible ways before it is finalized. Shipping software products should not rely on APIs defined in this specification.
45+
46+
47+
== Overview
48+
49+
The current specification and implementation of `sycl::shift_group_left` returns "the value of x from the work-item whose group local id (`id`) is delta larger than that of the calling work-item." If `id+delta` is greater or equal than the group's linear size (`sg_size`), then the value returned is unspecified. An equivalent problem occurs for `sycl::shift_group_right` if `id-delta < 0`.
50+
51+
The proposed extension of `sycl::shift_group_left` takes two values, the "current" value `x` and the "next" (or "previous") value `to_append` (`to_prepend`). If `id+delta` is less than the group's linear size, the function returns `x` of the work-item with group local id `id+delta`. If `id+delta` is greater than or equal to the group's linear size, the function returns `to_append` of the work-item with the group local id `id+delta-sg_size`. Thus, in all cases the return value is defined and valid.
52+
53+
54+
Similarly, the proposed extension of `sycl::shift_group_right` returns `x` of the work-item with group local id `id-delta`, if `id-delta` is greater than or equal to 0 and `to_prepend` of the work-item with group local id `sg_size+(id-delta)`, otherwise.
55+
56+
While this feature can easily be reproduced manually, as indicated below in the section "possible implementation", an extension is feasible since the underlying `pass:[__]spirv_SubgroupShuffleDownINTEL` and `pass:[__]spirv_SubgroupShuffleUpINTEL` already take the proposed two values and support the required capabilities.
57+
58+
=== Example
59+
Assuming the group `g` consists of 4 work items (`sg_size = 4`), which are enumerated `0-3` (`WI0`, `WI1`, `WI2`, `WI3`).
60+
Each work item holds a value `x` and a value `to_append`. To indicate the relation between a work item and its value, we enumerate the values as `x0`, `x1`, `x2`, `x3` for the values associated with work items `WI0`, `WI1`, `WI2`, `WI3`, respectively. Similarly for `to_append0`, etc.
61+
62+
Assuming `append_and_shift_group_left` is called with `delta=2`, we would get the following return values:
63+
[%header,cols="1,1,1,1,1"]
64+
|===
65+
|Work Item
66+
|WI0
67+
|WI1
68+
|WI2
69+
|WI3
70+
71+
|Input x
72+
|x0
73+
|x1
74+
|x2
75+
|x3
76+
77+
|Input to_append
78+
|to_append0
79+
|to_append1
80+
|to_append2
81+
|to_append3
82+
83+
|Returned Value
84+
|x2
85+
|x3
86+
|to_append0
87+
|to_append1
88+
|===
89+
90+
Similarly, calling `prepend_and_shift_group_right` (with enumerated `to_prepend0` to `to_prepend3` analogously to the above `to_append`) with `delta=3` would yield
91+
92+
[%header,cols="1,1,1,1,1"]
93+
|===
94+
|Work Item
95+
|WI0
96+
|WI1
97+
|WI2
98+
|WI3
99+
100+
|Input x
101+
|x0
102+
|x1
103+
|x2
104+
|x3
105+
106+
|Input to_prepend
107+
|to_prepend0
108+
|to_prepend1
109+
|to_prepend2
110+
|to_prepend3
111+
112+
|Returned Value
113+
|to_prepend1
114+
|to_prepend2
115+
|to_prepend3
116+
|x0
117+
|===
118+
119+
120+
121+
== Specification
122+
123+
=== Feature test macro
124+
This extension provides a feature-test macro as described in the core SYCL specification. An implementation supporting this extension must predefine the macro `SYCL_EXT_ONEAPI_APPEND_AND_SHIFT` to one of the values defined in the table below. Applications can test for the existence of this macro to determine if the implementation supports this feature, or applications can test the macro's value to determine which of the extension's features the implementation supports.
125+
126+
[%header,cols="1,5"]
127+
|===
128+
|Value
129+
|Description
130+
131+
|1
132+
|The APIs of this experimental extension are not versioned, so the feature-test macro always has this value.
133+
|===
134+
135+
=== API
136+
137+
```c++
138+
namespace sycl::ext::oneapi {
139+
140+
template <typename Group, typename T>
141+
T append_and_shift_group_left(Group g, T x, T to_append, Group::linear_id_type delta=1)
142+
143+
} // namespace sycl::ext::oneapi
144+
```
145+
146+
_Constraints_: Available only if `std::is_same_v<std::decay_t<Group>, sub_group>` is true and `T` is a trivially copyable type.
147+
148+
_Preconditions_: `delta` must be the same non-negative value for all work-items in the group.
149+
150+
_Returns_: The value `x` of the work-item with group local id `id+delta`, where `id` denotes the group local id of the work-item calling the function. If `id+delta` is greater than or equal to the group's linear size (`sg_size`), the function returns `to_append` of the work-item with the group local id `id+delta-sg_size`. If `id+delta` is greater than or equal to `2*sg_size` (which may happen if `delta` is greater than `sg_size`), the return value is undefined.
151+
152+
153+
154+
```c++
155+
namespace sycl::ext::oneapi {
156+
157+
template <typename Group, typename T>
158+
T prepend_and_shift_group_right(Group g, T x, T to_prepend, Group::linear_id_type delta=1)
159+
160+
} // namespace sycl::ext::oneapi
161+
```
162+
_Constraints_: Available only if `std::is_same_v<std::decay_t<Group>, sub_group>` is true and `T` is a trivially copyable type.
163+
164+
_Preconditions_: `delta` must be the same non-negative value for all work-items in the group.
165+
166+
_Returns_: The value `x` of the work-item with group local id `id-delta` if `id-delta` is greater than or equal to 0, and `to_prepend` of the work-item with group local id `sg_size+(id-delta)` otherwise. If `id-delta` is less than `-sg_size` (which may happen if `delta` is greater than `sg_size`), the return value is undefined.
167+
168+
169+
=== Possible Implementation
170+
171+
The feature can be implemented based on `pass:[__]spirv_SubgroupShuffleDownINTEL` and `pass:[__]spirv_SubgroupShuffleUpINTEL`. For devices without that capability, it can be implemented with the existing shuffle capabilities as follows:
172+
173+
```c++
174+
template <typename Group, typename T>
175+
T sycl::ext::oneapi::append_and_shift_group_left(Group g, T x, T to_append, Group::linear_id_type delta = 1)
176+
{
177+
T down_val = sycl::shift_group_left(g, x, delta);
178+
T up_val = sycl::shift_group_right(g,to_append, g.get_local_linear_range()-delta);
179+
180+
return delta+g.get_local_linear_id() > g.get_local_linear_range() ? down_val : up_val;
181+
}
182+
183+
template <typename Group, typename T>
184+
T sycl::ext::oneapi::prepend_and_shift_group_right(Group g, T x, T to_prepend, Group::linear_id_type delta = 1)
185+
{
186+
T up_val = sycl::shift_group_right(g, x, delta);
187+
T down_val = sycl::shift_group_left(g,to_prepend, g.get_local_linear_range()-delta);
188+
189+
return g.get_local_linear_id()-delta >= 0 ? up_val : down_val;
190+
}
191+
```
192+

0 commit comments

Comments
 (0)