-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL][Doc] Update if_architecture_is extension to include NVIDIA and AMD architectures #7246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…architectures - For NVIDI adds aspect for each sm version, - For AMD adds aspect for each architecture supported by ROCm.
Don't forget to rename the file too. |
sycl/doc/extensions/experimental/sycl_ext_intel_device_architecture.asciidoc
Outdated
Show resolved
Hide resolved
…_oneapi_device_architecture.asciidoc. - Minor update to reflect recent changes on cuda architecure additions.
Thanks @bader, done. |
|`nvidia_gpu_sm30` | ||
|1 | ||
|NVIDIA Kepler architecture. | ||
|
||
|`nvidia_gpu_sm32` | ||
|1 | ||
|NVIDIA Kepler architecture. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Listing the same architecture here is confusing, and I suspect some readers will struggle to understand the difference between the sm30, sm32, sm35 and sm37 lines in the table.
Why not list the compute capability too, to make it obvious that's what the number means? e.g.
nvidia_gpu_sm30
, NVIDIA Kepler architecture (compute capability 3.0)nvidia_gpu_sm32
, NVIDIA Kepler architecture (compute capability 3.2)
I think it might also be a good idea to add a non-normative note above or below the table that says something like:
"For NVIDIA GPUs, the architecture enumerator corresponds to the compute capability of the device, and ext_oneapi_architecture_is
can be used similarly to the __CUDA_ARCH__
macro in CUDA."
...since it may help folks migrating from CUDA to SYCL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @Pennycook for review. I applied your comments in.
Thanks for adding this! Could you also update the design document to describe how this will be implemented? I presume we will add new target names for the Assuming this is not going to be implemented in the same PR, we need to separate the part that is currently implemented from the part that is not yet implemented. Since this extension specification is in the "experimental" directory, it is currently implemented, and customers can rely on this specification to know how to use this feature. Simply adding to the specification breaks that contract because customers will no longer know what is vs. is not implemented. We usually solve this by creating a copy of the specification in the "proposed" directory and making that changes there. Once it is implemented, we move the proposed document to the "experimental" directory, overwriting the previous version. This way, the document in "experimental" always describes what is currently implemented. |
@@ -98,7 +98,7 @@ implementation supports. | |||
This extension adds a new enumeration of the architectures that can be tested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is really for the table above, which you did not change. Please add a new row to that table for version 2 of this specification, noting that the Nvidia and AMD architectures were added in version 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @gmlueck. done.
@@ -295,12 +337,176 @@ of these enumerators, and it provides a brief description of their meanings. | |||
|`intel_gpu_12_10_0` | |||
|1 | |||
|Alias for `intel_gpu_dg1`. | |||
|
|||
|`nvidia_gpu_sm20` | |||
|1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, change all these new entries to 2
, so that users know that these entries were added in version 2 of the specification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @gmlueck. done.
- Reflects the version addition to the document.
…oc to proposed folder. - Reflect updates to sycl_ext_oneapi_device_architecture.asciidoc into DeviceIf.md - Reverts changes made to experimental/sycl_ext_intel_device_architecture.asciidoc to avoid confustion on what is and what is not yet implemented.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. Just a couple of small comments below.
intel_gpu_12_0_0 = intel_gpu_tgllp, | ||
intel_gpu_12_10_0 = intel_gpu_dg1, | ||
|
||
nvidia_gpu_sm20, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these "nvidia" and "amd" enumerators should go before the "alias" enumerators like intel_gpu_8_0_0
. Otherwise, the "nvidia" and "amd" enumerators will alias some of the "intel" ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's right @gmlueck, thanks . done.
@@ -249,7 +329,7 @@ constexpr static auto if_architecture_is(T fnTrue, Args ...args) { | |||
} | |||
} | |||
|
|||
} // namespace ext::intel::experimental | |||
} // namespace ext::oneapi::exprimental | |||
} // namespace sycl | |||
``` | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is for the sentence below that says:
The only supported targets are
spir64_x86_64
and the newintel_gpu_*
GPU device names.
I think that sentence should be updated to include the "nvidia" and "amd" device names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @gmlueck, done.
- Adds amd and nvidia gpus as supported targets.
Update if_architecture_is extension to include NVIDIA and AMD architectures