Skip to content

Commit a5d62c8

Browse files
authored
[SYCL][Doc] Remove masked_sub_group from proposal (#8308)
During implementation, it became clear that the differences between the behaviors of masked_sub_group and other SYCL groups would require developers to special-case too many functions. This is at odds with the goal to support a generic group interface. The functionality originally provided by masked_sub_group will be provided via a different mechanism. Signed-off-by: John Pennycook <[email protected]>
1 parent 68214a7 commit a5d62c8

File tree

1 file changed

+5
-219
lines changed

1 file changed

+5
-219
lines changed

sycl/doc/extensions/proposed/sycl_ext_oneapi_non_uniform_groups.asciidoc

Lines changed: 5 additions & 219 deletions
Original file line numberDiff line numberDiff line change
@@ -43,7 +43,6 @@ SYCL specification refer to that revision.
4343

4444
This extension also depends on the following other SYCL extensions:
4545

46-
* link:https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/supported/sycl_ext_oneapi_sub_group_mask.asciidoc[sycl_ext_oneapi_sub_group_mask]
4746
* link:https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_root_group.asciidoc[sycl_ext_oneapi_root_group]
4847

4948

@@ -143,7 +142,6 @@ following user-constructed groups:
143142
- `cluster_group`
144143
- `tangle_group`
145144
- `opportunistic_group`
146-
- `masked_sub_group`
147145

148146
The `is_fixed_topology_group` and `is_user_constructed_group` traits can be
149147
used to detect whether a group type represents a fixed topology or
@@ -175,12 +173,11 @@ namespace sycl::ext::oneapi::experimental {
175173
`root_group`, `group` or `sub_group`.
176174

177175
`is_user_constructed_group<T>::value` is `std::true_type` if `T` is one of:
178-
`ballot_group`, `cluster_group`, `tangle_group`, `opportunisic_group` or
179-
`masked_sub_group`.
176+
`ballot_group`, `cluster_group`, `tangle_group`, or `opportunisic_group`.
180177

181178
Additionally, the `is_group<T>::value` trait from SYCL 2020 is `std::true_type`
182-
if `T` is one of: `ballot_group`, `cluster_group`, `tangle_group`,
183-
`opportunistic_group` or `masked_sub_group`.
179+
if `T` is one of: `ballot_group`, `cluster_group`, `tangle_group`, or
180+
`opportunistic_group`.
184181

185182

186183
=== Group Functions and Algorithms
@@ -199,13 +196,13 @@ make assumptions regarding work-item scheduling and forward progress
199196
guarantees.
200197

201198
The following group functions support the `ballot_group`, `cluster_group`,
202-
`tangle_group`, `opportunistic_group` and `masked_sub_group` group types:
199+
`tangle_group`, and `opportunistic_group` group types:
203200

204201
* `group_barrier`
205202
* `group_broadcast`
206203

207204
The following group algorithms support `ballot_group`, `cluster_group`,
208-
`tangle_group`, `opportunistic_group` and `masked_sub_group` group types:
205+
`tangle_group`, and `opportunistic_group` group types:
209206

210207
* `joint_any_of` and `any_of_group`
211208
* `joint_all_of` and `all_of_group`
@@ -898,217 +895,6 @@ int atomic_aggregate_inc(sycl::sub_group sg, sycl::atomic_ref<int, Order, Scope,
898895
----
899896

900897

901-
=== Masked Sub-groups
902-
903-
A masked sub-group is a non-contiguous subset of a sub-group, representing an
904-
arbitrary user-defined subset of work-items. The members of a masked sub-group
905-
are described by a bitmask, where a 1 denotes membership of the group.
906-
907-
The work-items within a masked sub-group retain information about the original
908-
sub-group, and many member functions of the `masked_sub_group` class reflect
909-
this. Developers are strongly recommended to use other user-constructed groups
910-
that match their use-case, both for improved performance and a simplified
911-
mental model.
912-
913-
NOTE: Masked sub-groups exist primarily to support experimentation with
914-
arbitrary subsets of work-items within a sub-group, and to support the
915-
migration of algorithms already expressed via masks.
916-
917-
918-
==== Creation
919-
920-
Masked sub-groups are created by calls to the `get_masked_sub_group()`
921-
function, which applies a bitmask to an existing sub-group.
922-
923-
NOTE: Creating a masked sub-group does not require a barrier across all
924-
work-items in the parent sub-group or introduce any sychronization, since
925-
work-items can independently identify members directly from the specified
926-
membership mask.
927-
928-
[source, c++]
929-
----
930-
namespace ext::oneapi::experimental {
931-
932-
masked_sub_group get_masked_sub_group(sub_group sg, sub_group_mask mask);
933-
934-
} // namespace ext::oneapi::experimental
935-
----
936-
937-
_Preconditions_: All work-items in `sg` with a corresponding bit set in `mask`
938-
must encounter this function in converged control flow.
939-
940-
_Returns_: A `masked_sub_group` consisting of the work-items in `sg` with a
941-
corresponding bit set in `mask`.
942-
943-
944-
==== `masked_sub_group` Class
945-
946-
The `masked_sub_group` class contains an additional `get_mask()` function,
947-
returning the membership mask. Since the other member functions of
948-
`masked_sub_group` reflect the original sub-group, developers must use this
949-
mask to reason about the local numbering of work-items within the group.
950-
951-
[source, c++]
952-
----
953-
namespace sycl::ext::oneapi::experimental {
954-
955-
class masked_sub_group {
956-
public:
957-
using id_type = id<1>;
958-
using range_type = range<1>;
959-
using linear_id_type = uint32_t;
960-
static constexpr int dimensions = 1;
961-
static constexpr sycl::memory_scope fence_scope =
962-
sycl::memory_scope::sub_group;
963-
964-
id_type get_group_id() const;
965-
966-
id_type get_local_id() const;
967-
968-
range_type get_group_range() const;
969-
970-
range_type get_local_range() const;
971-
972-
linear_id_type get_group_linear_id() const;
973-
974-
linear_id_type get_local_linear_id() const;
975-
976-
linear_id_type get_group_linear_range() const;
977-
978-
linear_id_type get_local_linear_range() const;
979-
980-
bool leader() const;
981-
982-
sub_group_mask get_mask() const;
983-
};
984-
985-
}
986-
----
987-
988-
[source,c++]
989-
----
990-
id_type get_group_id() const;
991-
----
992-
_Returns_: An `id` representing the index of the sub-group within the
993-
parent work-group.
994-
995-
[source,c++]
996-
----
997-
id_type get_local_id() const;
998-
----
999-
_Returns_: An `id` representing the calling work-item's position within
1000-
the sub-group.
1001-
1002-
[source,c++]
1003-
----
1004-
range_type get_group_range() const;
1005-
----
1006-
_Returns_: A `range` representing the number of sub-groups within the parent
1007-
work-group.
1008-
1009-
[source,c++]
1010-
----
1011-
range_type get_local_range() const;
1012-
----
1013-
_Returns_: A `range` representing the number of work-items in the sub-group.
1014-
1015-
[source,c++]
1016-
----
1017-
id_type get_group_linear_id() const;
1018-
----
1019-
_Returns_: A linearized version of the `id` returned by `get_group_id()`.
1020-
1021-
[source,c++]
1022-
----
1023-
id_type get_local_linear_id() const;
1024-
----
1025-
_Returns_: A linearized version of the `id` returned by `get_local_linear_id()`.
1026-
1027-
[source,c++]
1028-
----
1029-
range_type get_group_linear_range() const;
1030-
----
1031-
_Returns_: A linearized version of the `id` returned by `get_group_range()`.
1032-
1033-
[source,c++]
1034-
----
1035-
range_type get_local_linear_range() const;
1036-
----
1037-
_Returns_: A linearized version of the `id` returned by `get_local_range()`.
1038-
1039-
[source,c++]
1040-
----
1041-
bool leader() const;
1042-
----
1043-
_Returns_: `true` for exactly one work-item in the masked sub-group, if the
1044-
calling work-item is the leader of the masked sub-group, and `false` for all
1045-
other work-items in the masked sub-group. The leader of the masked sub-group
1046-
is guaranteed to be the work-item corresponding to the least-significant bit in
1047-
the mask.
1048-
1049-
[source,c++]
1050-
----
1051-
sub_group_mask get_mask() const;
1052-
----
1053-
_Returns_: A `sub_group_mask` representing which work-items from the sub-group
1054-
are considered a member of this `masked_sub_group`.
1055-
1056-
1057-
==== Usage Example
1058-
1059-
A `masked_sub_group` can be used to implement algorithms where a membership
1060-
mask is already present or easily computed:
1061-
1062-
[source, c++]
1063-
----
1064-
// set initial mask to full sub-group
1065-
auto sg = it.get_sub_group();
1066-
auto active = std::pow(2, sg.get_max_local_range()) - 1;
1067-
1068-
float sum = x;
1069-
for (int shift = sg.get_max_local_range() / 2; shift > 0; shift /= 2)
1070-
{
1071-
// create representation of work-items still active in this phase
1072-
auto masked_sg = sycl::ext::oneapi::experimental::get_masked_sub_group(sg, active);
1073-
1074-
// call shift only for work-items that are still active
1075-
// using the parent sub_group would have been unsafe due to divergence
1076-
sum += sycl::shift_group_left(masked_sg, x, shift);
1077-
1078-
// remove half of the work-items from the group
1079-
active >>= shift;
1080-
}
1081-
----
1082-
1083-
Note that in many cases these algorithms can be translated (manually) to use
1084-
one of the alternative group types:
1085-
1086-
[source, c++]
1087-
----
1088-
// set initial mask to full sub-group
1089-
auto sg = it.get_sub_group();
1090-
1091-
float sum = x;
1092-
for (int phase = 1; phase < sg.get_max_local_range() / 2; phase *= 2)
1093-
{
1094-
// create representation of work-items still active in this phase
1095-
auto active_group = sycl::ext::oneapi::experimental::get_tangle_group(sg);
1096-
1097-
// call shift only for work-items that are still active
1098-
// note that the shift is now 1, because of how tangle-group local IDs are defined
1099-
sum += sycl::shift_group_left(active_group, x, 1);
1100-
}
1101-
----
1102-
1103-
Or, even more simply, one of the SYCL group algorithms:
1104-
1105-
[source, c++]
1106-
----
1107-
auto sg = it.get_sub_group();
1108-
sum = sycl::reduce_over_group(sg, x, sycl::plus<>());
1109-
----
1110-
1111-
1112898
== Implementation notes
1113899

1114900
This non-normative section provides information about one possible

0 commit comments

Comments
 (0)