-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][Doc] Add draft forward progress extension #7598
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This extension strengthens the forward progress guarantees provided by a SYCL implementation, and exposes the guarantees of particular implementations/backends/devices via new device queries and kernel properties. Signed-off-by: John Pennycook <[email protected]>
Note to self and future reviewers: the launch configuration queries as currently written are insufficient, because there may be dependencies between query results. For example, the maximum number of work-groups that can execute concurrently on an NVIDIA GPU depends on the work-group size (see the documentation for cudaOccupancyMaxActiveBlocksPerMultiprocessor). In general, and looking across implementations, I think:
|
@intel/dpcpp-specification-reviewers - I think this is ready for review, now. |
@intel/dpcpp-specification-reviewers: Do you think this is ready to be merged? If I recall correctly, we reviewed the draft internally ahead of the IWOCL presentation. There's been no activity here for a few months. |
The ability of a device to simultaneously execute work-groups may depend on the amount of dynamically allocated work-group local memory (e.g. from accessors). Signed-off-by: John Pennycook <[email protected]>
@bashbaug: I've just tried to add the local memory query that we talked about. Please let me know if this isn't the solution you expected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Just a few suggestions.
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
---- | ||
namespace ext::oneapi::experimental::sycl { | ||
|
||
enum class forward_progress_guarantee { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the values of this enum represent semantics within a fixed hierarchy, should we specify that they have set values, either explicitly in the synopsis or as an implementation requirement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should do this. P2300 also defines a forward_progress_guarantee
enum, and does not specify values. I think we should stay aligned with that, at least for now.
I'm also worried that we might have to add additional progress guarantees at some point (either because SYCL needs them, or because C++ introduces them), and updating previously specified values might break user code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, having fixed values could complicate adding additional guarantees in the future, especially if they fit in between existing guarantees or if it's no longer a simpler linear hierarchy.
---- | ||
namespace ext::oneapi::experimental::sycl { | ||
|
||
enum class execution_scope { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the comment above, since the values of this enum represent semantics within a fixed hierarchy, should we specify that they have set values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should view this as a fixed hierarchy. There may be implementations that want to add additional execution scopes in the middle of the existing hierarchy. For example, if it's possible to provide stronger forward progress guarantees for all work-groups executing on the same cluster of SMs on an NVIDIA GPU, it would make sense to add something like a cluster
execution scope between work_group
and root_group
.
Aside: These sorts of concerns are why we split things into a coordination and target scope. Initially, we considered implicitly specifying a forward progress guarantee relative to the immediate parent (i.e. work-item forward progress guarantees would always be relative to the sub-group, and so on), but this would break user code unexpectedly if/when new layers were added to the hierarchy. By making both scopes explicit, we can add new layers to the hierarchy safely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's a good point, I agree it makes sense then to leave these without set values.
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for writing this up! This is a complex topic and I think you've done a great job making it as approachable as possible.
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
- They must respect all other properties - They must respect usage of work-group local memory
- Require a minimum of 1 if a kernel is supported - Require a value of 0 if a kernel is not supported - Clarify that values respect all properties
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
Co-authored-by: Greg Lueck <[email protected]>
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
Co-authored-by: Greg Lueck <[email protected]>
Co-authored-by: Ben Ashbaugh <[email protected]>
This is in the ISO C++ definition, but replicating it here in a non-normative note may help readers understand the implications of the execution model.
Queries and properties relate to the intermediate threads of execution, and so they must also be part of the associated scope(s).
This hopefully clarifies what the return value of this query really represents, and how it might be used.
5db4836
to
d7dbf7d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates, @Pennycook. I think the meaning of the forward-progress properties are now more formally defined according to the execution model. I added a few more comments below, though.
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_root_group.asciidoc
Outdated
Show resolved
Hide resolved
Each work-group must be the same size, and the sub-group size has already been determined (because the query is relative to a specific work-group size). The number of sub-groups must therefore be known.
sycl/doc/extensions/proposed/sycl_ext_oneapi_launch_queries.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want the Param::return_type
and Param::argument_types
to be implementation details, I think it's better to leave them out of the spec entirely. The comments below show one way to do this, which aligns with the proposed wording in the core spec in KhronosGroup/SYCL-Docs#440.
sycl/doc/extensions/proposed/sycl_ext_oneapi_launch_queries.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_launch_queries.asciidoc
Outdated
Show resolved
Hide resolved
sycl/doc/extensions/proposed/sycl_ext_oneapi_launch_queries.asciidoc
Outdated
Show resolved
Hide resolved
Co-authored-by: Greg Lueck <[email protected]>
I prefer this too. I've made these changes in 36433dd and 1cdabef. Note that |
I created internal Khronos issue 663 to discuss this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates @Pennycook, looks great! I've just added a couple of additonal comments.
sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc
Outdated
Show resolved
Hide resolved
@intel/llvm-gatekeepers: This can be merged! 🥳 |
|
||
[source,c++] | ||
---- | ||
namespace ext::oneapi::experimental::sycl { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this (and other) namespace definition be sycl::ext::oneapi::experimental
instead? (sycl::
in the beginning and not at the end)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...Yes. I have no idea how this happened! Probably a bad vim search and replace. Good catch!
This extension strengthens the forward progress guarantees provided by a SYCL implementation, and exposes the guarantees of particular implementations/backends/devices via new device queries and kernel properties.
Signed-off-by: John Pennycook [email protected]