Skip to content

[SYCL][Doc] Add draft forward progress extension #7598

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 31 commits into from
Aug 25, 2023

Conversation

Pennycook
Copy link
Contributor

This extension strengthens the forward progress guarantees provided by a SYCL implementation, and exposes the guarantees of particular implementations/backends/devices via new device queries and kernel properties.

Signed-off-by: John Pennycook [email protected]

@Pennycook Pennycook added the spec extension All issues/PRs related to extensions specifications label Nov 30, 2022
This extension strengthens the forward progress guarantees provided by a
SYCL implementation, and exposes the guarantees of particular
implementations/backends/devices via new device queries and kernel
properties.

Signed-off-by: John Pennycook <[email protected]>
@Pennycook
Copy link
Contributor Author

Note to self and future reviewers: the launch configuration queries as currently written are insufficient, because there may be dependencies between query results. For example, the maximum number of work-groups that can execute concurrently on an NVIDIA GPU depends on the work-group size (see the documentation for cudaOccupancyMaxActiveBlocksPerMultiprocessor).

In general, and looking across implementations, I think:

  • The maximum number of work-groups may depend on a user-supplied work-group size
  • The maximum work-group size may depend on a user-supplied number of sub-groups
  • The maximum number of sub-groups depends on the device
  • The maximum sub-group size depends on the kernel

@Pennycook Pennycook marked this pull request as ready for review March 13, 2023 18:11
@Pennycook Pennycook requested a review from a team as a code owner March 13, 2023 18:11
@Pennycook
Copy link
Contributor Author

@intel/dpcpp-specification-reviewers - I think this is ready for review, now.

@Pennycook Pennycook requested a review from gmlueck May 17, 2023 13:55
@Pennycook
Copy link
Contributor Author

@intel/dpcpp-specification-reviewers: Do you think this is ready to be merged? If I recall correctly, we reviewed the draft internally ahead of the IWOCL presentation. There's been no activity here for a few months.

The ability of a device to simultaneously execute work-groups may depend on the
amount of dynamically allocated work-group local memory (e.g. from accessors).

Signed-off-by: John Pennycook <[email protected]>
@Pennycook
Copy link
Contributor Author

@bashbaug: I've just tried to add the local memory query that we talked about. Please let me know if this isn't the solution you expected.

Copy link
Contributor

@AerialMantis AerialMantis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Just a few suggestions.

----
namespace ext::oneapi::experimental::sycl {

enum class forward_progress_guarantee {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the values of this enum represent semantics within a fixed hierarchy, should we specify that they have set values, either explicitly in the synopsis or as an implementation requirement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should do this. P2300 also defines a forward_progress_guarantee enum, and does not specify values. I think we should stay aligned with that, at least for now.

I'm also worried that we might have to add additional progress guarantees at some point (either because SYCL needs them, or because C++ introduces them), and updating previously specified values might break user code.

Copy link
Contributor

@AerialMantis AerialMantis Aug 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, having fixed values could complicate adding additional guarantees in the future, especially if they fit in between existing guarantees or if it's no longer a simpler linear hierarchy.

----
namespace ext::oneapi::experimental::sycl {

enum class execution_scope {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the comment above, since the values of this enum represent semantics within a fixed hierarchy, should we specify that they have set values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should view this as a fixed hierarchy. There may be implementations that want to add additional execution scopes in the middle of the existing hierarchy. For example, if it's possible to provide stronger forward progress guarantees for all work-groups executing on the same cluster of SMs on an NVIDIA GPU, it would make sense to add something like a cluster execution scope between work_group and root_group.

Aside: These sorts of concerns are why we split things into a coordination and target scope. Initially, we considered implicitly specifying a forward progress guarantee relative to the immediate parent (i.e. work-item forward progress guarantees would always be relative to the sub-group, and so on), but this would break user code unexpectedly if/when new layers were added to the hierarchy. By making both scopes explicit, we can add new layers to the hierarchy safely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's a good point, I agree it makes sense then to leave these without set values.

Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing this up! This is a complex topic and I think you've done a great job making it as approachable as possible.

- They must respect all other properties
- They must respect usage of work-group local memory
- Require a minimum of 1 if a kernel is supported
- Require a value of 0 if a kernel is not supported
- Clarify that values respect all properties
Pennycook and others added 2 commits July 24, 2023 10:04
Co-authored-by: Greg Lueck <[email protected]>
Co-authored-by: Ben Ashbaugh <[email protected]>
This is in the ISO C++ definition, but replicating it here in a non-normative
note may help readers understand the implications of the execution model.
Queries and properties relate to the intermediate threads of execution, and so
they must also be part of the associated scope(s).
This hopefully clarifies what the return value of this query really represents,
and how it might be used.
Copy link
Contributor

@gmlueck gmlueck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates, @Pennycook. I think the meaning of the forward-progress properties are now more formally defined according to the execution model. I added a few more comments below, though.

Each work-group must be the same size, and the sub-group size has already been
determined (because the query is relative to a specific work-group size). The
number of sub-groups must therefore be known.
Copy link
Contributor

@gmlueck gmlueck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want the Param::return_type and Param::argument_types to be implementation details, I think it's better to leave them out of the spec entirely. The comments below show one way to do this, which aligns with the proposed wording in the core spec in KhronosGroup/SYCL-Docs#440.

@Pennycook
Copy link
Contributor Author

If we want the Param::return_type and Param::argument_types to be implementation details, I think it's better to leave them out of the spec entirely. The comments below show one way to do this, which aligns with the proposed wording in the core spec in KhronosGroup/SYCL-Docs#440.

I prefer this too. I've made these changes in 36433dd and 1cdabef.

Note that Param::return_type is currently how SYCL 2020 defines its queries, and it's not very well defined there. It might be a good idea to swap it out for /* return-type */ at some point.

@gmlueck
Copy link
Contributor

gmlueck commented Aug 1, 2023

Note that Param::return_type is currently how SYCL 2020 defines its queries, and it's not very well defined there. It might be a good idea to swap it out for /* return-type */ at some point.

I created internal Khronos issue 663 to discuss this.

@Pennycook Pennycook requested a review from AerialMantis August 3, 2023 18:17
Copy link
Contributor

@AerialMantis AerialMantis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @Pennycook, looks great! I've just added a couple of additonal comments.

@Pennycook
Copy link
Contributor Author

@intel/llvm-gatekeepers: This can be merged! 🥳

@steffenlarsen steffenlarsen merged commit 6746114 into intel:sycl Aug 25, 2023
@Pennycook Pennycook deleted the forward_progress branch August 25, 2023 17:40

[source,c++]
----
namespace ext::oneapi::experimental::sycl {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this (and other) namespace definition be sycl::ext::oneapi::experimental instead? (sycl:: in the beginning and not at the end)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...Yes. I have no idea how this happened! Probably a bad vim search and replace. Good catch!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
spec extension All issues/PRs related to extensions specifications
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants