Skip to content

[SYCL][Doc] Update if_architecture_is extension to include NVIDIA and AMD architectures #7246

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Nov 3, 2022

Conversation

mmoadeli
Copy link
Contributor

@mmoadeli mmoadeli commented Nov 1, 2022

Update if_architecture_is extension to include NVIDIA and AMD architectures

  • For NVIDIA adds aspect for each sm version,
  • For AMD adds aspect for each architecture supported by ROCm,
  • Copies updated version of experimental/sycl_ext_intel_device_architecture.asciidoc to proposed/sycl_ext_oneapi_device_architecture.asciidoc.

…architectures

- For NVIDI adds aspect for each sm version,
- For AMD adds aspect for each architecture supported by ROCm.
@mmoadeli mmoadeli requested a review from a team as a code owner November 1, 2022 12:08
@bader
Copy link
Contributor

bader commented Nov 1, 2022

Don't forget to rename the file too.

…_oneapi_device_architecture.asciidoc.

-  Minor update to reflect recent changes on cuda architecure additions.
@mmoadeli
Copy link
Contributor Author

mmoadeli commented Nov 1, 2022

Don't forget to rename the file too.

Thanks @bader, done.

Comment on lines 345 to 351
|`nvidia_gpu_sm30`
|1
|NVIDIA Kepler architecture.

|`nvidia_gpu_sm32`
|1
|NVIDIA Kepler architecture.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Listing the same architecture here is confusing, and I suspect some readers will struggle to understand the difference between the sm30, sm32, sm35 and sm37 lines in the table.

Why not list the compute capability too, to make it obvious that's what the number means? e.g.

  • nvidia_gpu_sm30, NVIDIA Kepler architecture (compute capability 3.0)
  • nvidia_gpu_sm32, NVIDIA Kepler architecture (compute capability 3.2)

I think it might also be a good idea to add a non-normative note above or below the table that says something like:

"For NVIDIA GPUs, the architecture enumerator corresponds to the compute capability of the device, and ext_oneapi_architecture_is can be used similarly to the __CUDA_ARCH__ macro in CUDA."

...since it may help folks migrating from CUDA to SYCL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @Pennycook for review. I applied your comments in.

@gmlueck
Copy link
Contributor

gmlueck commented Nov 1, 2022

Thanks for adding this!

Could you also update the design document to describe how this will be implemented? I presume we will add new target names for the -fsycl-targets compiler option? If so, it is sufficient to update that design document to add the new command line options to the list and to add the matching predefined macro names to the list. If the implementation is somehow more complicated, we should discuss.

Assuming this is not going to be implemented in the same PR, we need to separate the part that is currently implemented from the part that is not yet implemented. Since this extension specification is in the "experimental" directory, it is currently implemented, and customers can rely on this specification to know how to use this feature. Simply adding to the specification breaks that contract because customers will no longer know what is vs. is not implemented. We usually solve this by creating a copy of the specification in the "proposed" directory and making that changes there. Once it is implemented, we move the proposed document to the "experimental" directory, overwriting the previous version. This way, the document in "experimental" always describes what is currently implemented.

@@ -98,7 +98,7 @@ implementation supports.
This extension adds a new enumeration of the architectures that can be tested.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is really for the table above, which you did not change. Please add a new row to that table for version 2 of this specification, noting that the Nvidia and AMD architectures were added in version 2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @gmlueck. done.

@@ -295,12 +337,176 @@ of these enumerators, and it provides a brief description of their meanings.
|`intel_gpu_12_10_0`
|1
|Alias for `intel_gpu_dg1`.

|`nvidia_gpu_sm20`
|1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition, change all these new entries to 2, so that users know that these entries were added in version 2 of the specification.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @gmlueck. done.

@mmoadeli mmoadeli changed the title [SYCL] Update if_architecture_is extension to include NVIDIA and AMD architectures [SYCL][Doc] Update if_architecture_is extension to include NVIDIA and AMD architectures Nov 2, 2022
- Reflects the version addition to the document.
…oc to proposed folder.

- Reflect updates to  sycl_ext_oneapi_device_architecture.asciidoc into  DeviceIf.md
- Reverts changes made to experimental/sycl_ext_intel_device_architecture.asciidoc to avoid confustion on what is and what is not yet implemented.
@mmoadeli
Copy link
Contributor Author

mmoadeli commented Nov 2, 2022

Thanks for adding this!

Could you also update the design document to describe how this will be implemented? I presume we will add new target names for the -fsycl-targets compiler option? If so, it is sufficient to update that design document to add the new command line options to the list and to add the matching predefined macro names to the list. If the implementation is somehow more complicated, we should discuss.

Assuming this is not going to be implemented in the same PR, we need to separate the part that is currently implemented from the part that is not yet implemented. Since this extension specification is in the "experimental" directory, it is currently implemented, and customers can rely on this specification to know how to use this feature. Simply adding to the specification breaks that contract because customers will no longer know what is vs. is not implemented. We usually solve this by creating a copy of the specification in the "proposed" directory and making that changes there. Once it is implemented, we move the proposed document to the "experimental" directory, overwriting the previous version. This way, the document in "experimental" always describes what is currently implemented.

@gmlueck

  • The updates are reflected into design document
  • The updated version of experimental/sycl_ext_intel_device_architecture.asciidoc is copied into proposed/sycl_ext_oneapi_device_architecture.asciidoc.
  • experimental/sycl_ext_intel_device_architecture.asciidoc is reverted to it's original state.

Copy link
Contributor

@gmlueck gmlueck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Just a couple of small comments below.

intel_gpu_12_0_0 = intel_gpu_tgllp,
intel_gpu_12_10_0 = intel_gpu_dg1,

nvidia_gpu_sm20,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these "nvidia" and "amd" enumerators should go before the "alias" enumerators like intel_gpu_8_0_0. Otherwise, the "nvidia" and "amd" enumerators will alias some of the "intel" ones.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's right @gmlueck, thanks . done.

@@ -249,7 +329,7 @@ constexpr static auto if_architecture_is(T fnTrue, Args ...args) {
}
}

} // namespace ext::intel::experimental
} // namespace ext::oneapi::exprimental
} // namespace sycl
```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is for the sentence below that says:

The only supported targets are spir64_x86_64 and the new intel_gpu_* GPU device names.

I think that sentence should be updated to include the "nvidia" and "amd" device names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @gmlueck, done.

@pvchupin pvchupin merged commit c6091df into intel:sycl Nov 3, 2022
@mmoadeli mmoadeli deleted the amd-gpu-ext-arch branch July 7, 2023 10:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants