Commit a818156

Support automatic / primary sub-group sizes
Describe how we will bundle kernels that have a "named" required sub-group size (i.e. decorated with `[[intel::named_sub_group_size(NAME)]]`).
1 parent da9d05e commit a818156

1 file changed, +11 -7 lines changed
sycl/doc/OptionalDeviceFeatures.md

Lines changed: 11 additions & 7 deletions
@@ -489,10 +489,15 @@ Therefore, two kernels or exported device functions are only bundled together
 into the same device image if all of the following are true:

 * They share the same set of *Used* aspects,
-* They either both have no required sub-group size or both have the same
-required sub-group size, and
 * They either both have no required work-group size or both have the same
-required work-group size.
+required work-group size, and
+* They either both have the same numeric value for their required sub-group
+size or neither has a numeric value for a required sub-group size. (Note
+that this implies that kernels decorated with
+`[[intel::named_sub_group_size(automatic)]]` can be bundled together with
+kernels that are decorated with `[[intel::named_sub_group_size(primary)]]`
+and that either of these kernels could be bundled with a kernel that has no
+required sub-group size.)

 These criteria are an additional filter applied to the device code split
 algorithm after taking into account the `-fsycl-device-code-split` command line
@@ -526,10 +531,9 @@ property (which is always divisible by `4`) tells the number of aspects in the
 array.

 There is a "reqd\_sub\_group\_size" property if the image contains any kernels
-with a required sub-group size. The value of the property is a `uint32` value
-that tells the required size. (The device code split algorithm ensures that
-there are never two kernels with different required sub-group sizes in the same
-image.)
+with a numeric required sub-group size. (I.e. this excludes kernels where the
+required sub-group size is a named value like `automatic` or `primary`.) The
+value of the property is a `uint32` value that tells the required size.

 There is a "reqd\_work\_group\_size" property if the image contains any kernels
 with a required work-group size. The value of the property is a `BYTE_ARRAY`
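
The revised bundling rule and the narrowed "reqd\_sub\_group\_size" property can be illustrated with a small stand-alone sketch. This is not the compiler's implementation: `KernelReqs`, `canBundle`, and `reqdSubGroupSizeProperty` are hypothetical names, and representing a named (`automatic` / `primary`) or absent sub-group size as an empty `std::optional` is an assumption made only for this illustration.

```cpp
#include <cstdint>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical per-kernel summary; the names are illustrative only and are
// not taken from the DPC++ sources.
struct KernelReqs {
  std::set<std::string> usedAspects;
  // Holds a value only for a numeric required sub-group size. Kernels with no
  // requirement, or with a named size (automatic / primary), leave it empty.
  std::optional<std::uint32_t> numericSubGroupSize;
  // Required work-group size (one entry per dimension), empty if none.
  std::optional<std::vector<std::uint32_t>> workGroupSize;
};

// True when two kernels meet all three bundling criteria from the text:
// same Used aspects, same (or no) required work-group size, and same
// (or no) numeric required sub-group size.
inline bool canBundle(const KernelReqs &a, const KernelReqs &b) {
  return a.usedAspects == b.usedAspects &&
         a.workGroupSize == b.workGroupSize &&
         a.numericSubGroupSize == b.numericSubGroupSize;
}

// Value of the "reqd_sub_group_size" property for one image, or empty when
// the property is absent. Assumes the image was built with canBundle, so all
// numeric sizes in it agree and the first one found is representative.
inline std::optional<std::uint32_t>
reqdSubGroupSizeProperty(const std::vector<KernelReqs> &image) {
  for (const KernelReqs &k : image)
    if (k.numericSubGroupSize)
      return k.numericSubGroupSize;
  return std::nullopt;
}
```

Under these assumptions, an `automatic` kernel, a `primary` kernel, and a kernel with no sub-group requirement all compare equal on the sub-group criterion, and an image containing only such kernels carries no "reqd\_sub\_group\_size" property.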
