Skip to content

[SYCL] Make intel specific device info descriptors namespace qualified #6639

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Sep 9, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -21,15 +21,15 @@ As encouraged by the SYCL specification, a feature-test macro, `SYCL_EXT_ONEAPI_

| Device descriptors | Return type | Description |
| ------------------------------------------------------ | ----------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| info::device::ext_oneapi_max_work_groups_1d |  id<1> | Returns the maximum number of work-groups that can be submitted in each dimension of the `globalSize` of a `nd_range<1>`. The minimum value is `(1)` if the device is different than `info::device_type::custom`. |
| info::device::ext_oneapi_max_work_groups_2d |  id<2> | Returns the maximum number of work-groups that can be submitted in each dimension of the `globalSize` of a `nd_range<2>`. The minimum value is `(1, 1)` if the device is different than `info::device_type::custom`. |
| info::device::ext_oneapi_max_work_groups_3d |  id<3> | Returns the maximum number of work-groups that can be submitted in each dimension of the `globalSize` of a `nd_range<3>`. The minimum value is `(1, 1, 1)` if the device is different than `info::device_type::custom`. |
| info::device::ext_oneapi_max_global_work_groups |  size_t | Returns the maximum number of work-groups that can be submitted across all the dimensions. The minimum value is `1`. |
| ext::oneapi::experimental::info::device::max_work_groups<1> |  id<1> | Returns the maximum number of work-groups that can be submitted in each dimension of the `globalSize` of a `nd_range<1>`. The minimum value is `(1)` if the device is different than `info::device_type::custom`. |
| ext::oneapi::experimental::info::device::max_work_groups<2> |  id<2> | Returns the maximum number of work-groups that can be submitted in each dimension of the `globalSize` of a `nd_range<2>`. The minimum value is `(1, 1)` if the device is different than `info::device_type::custom`. |
| ext::oneapi::experimental::info::device::max_work_groups<3> |  id<3> | Returns the maximum number of work-groups that can be submitted in each dimension of the `globalSize` of a `nd_range<3>`. The minimum value is `(1, 1, 1)` if the device is different than `info::device_type::custom`. |
| ext::oneapi::experimental::info::device::max_global_work_groups |  size_t | Returns the maximum number of work-groups that can be submitted across all the dimensions. The minimum value is `1`. |

### Note

- The returned values have the same ordering as the `nd_range` arguments.
- The implementation does not guarantee that the user could select all the maximum numbers returned by `ext_oneapi_max_work_groups` at the same time. Thus the user should also check that the selected number of work-groups across all dimensions is smaller than the maximum global number returned by `ext_oneapi_max_global_work_groups`.
- The implementation does not guarantee that the user could select all the maximum numbers returned by `max_work_groups` at the same time. Thus the user should also check that the selected number of work-groups across all dimensions is smaller than the maximum global number returned by `max_global_work_groups`.

## Examples

Expand All @@ -38,8 +38,8 @@ sycl::device gpu = sycl::device{sycl::gpu_selector{}};
std::cout << gpu.get_info<sycl::info::device::name>() << '\n';

#ifdef SYCL_EXT_ONEAPI_MAX_WORK_GROUP_QUERY
sycl::id<3> groups = gpu.get_info<sycl::info::device::ext_oneapi_max_work_groups_3d>();
size_t global_groups = gpu.get_info<sycl::info::device::ext_oneapi_max_global_work_groups>();
sycl::id<3> groups = gpu.get_info<sycl::ext::oneapi::experimental::info::device::max_work_groups<3>>();
size_t global_groups = gpu.get_info<sycl::ext::oneapi::experimental::info::device::max_global_work_groups>();
std::cout << "Max number groups: x_max: " << groups[2] << " y_max: " << groups[1] << " z_max: " << groups[0] << '\n';
std::cout << "Max global number groups: " << global_groups << '\n';
#endif
Expand Down Expand Up @@ -68,12 +68,20 @@ assert(global_size[2] * global_size[1] * global_size[0] <= global_groups); //Mak

gpu_queue.submit(work_range, ...);
```
## Deprecated queries

## Implementation
The table below lists the soon to be removed deprecated descriptors and their replacements:

|Deprecated Descriptors| Replacement Decriptors|
| -------------------- | -------------------- |
| sycl::info::ext_oneapi_max_global_work_groups |sycl::ext::oneapi::experimental::info::max_global_work_groups |
| sycl::info::ext_oneapi_max_work_groups_*N*d | sycl::ext::oneapi::experimental::info::max_work_groups\<*N*\> |

* Note *N* can take the value 1,2, or 3

### Templated queries

Right now, DPC++ does not support templated device descriptors as they are defined in the SYCL specification section 4.6.4.2 "Device information descriptors". When the implementation supports this syntax, `ext_oneapi_max_work_groups_[1,2,3]d` should be replaced by the templated syntax: `ext_oneapi_max_work_groups<[1,2,3]>`.
## Implementation

### Consistency with existing checks

The implementation already checks when enqueuing a kernel that the global and per dimension work-group number is smaller than `std::numeric_limits<int>::max`. This check is implemented in `sycl/include/sycl/handler.hpp`. For consistency, values returned by the two device descriptors are bound by this limit.
Expand Down
56 changes: 36 additions & 20 deletions sycl/doc/extensions/supported/sycl_ext_intel_device_info.md
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ The extension supports this query in version 2 and later.

| Device Descriptors | Return Type | Description |
| ------------------ | ----------- | ----------- |
| info\:\:device\:\:ext\_intel\_device\_info\_uuid | unsigned char | Returns the device UUID|
| ext\:\:intel\:\:info\:\:device\:\:uuid | unsigned char | Returns the device UUID|


## Aspects ##
Expand All @@ -52,7 +52,7 @@ An invalid object runtime error will be thrown if the device does not support as
The UUID can be obtained using the standard get\_info() interface.

if (dev.has(aspect::ext_intel_device_info_uuid)) {
auto UUID = dev.get_info<info::device::ext_intel_device_info_uuid>();
auto UUID = dev.get_info<ext::intel::info::device::uuid>();
}


Expand All @@ -75,7 +75,7 @@ All versions of the extension support this query.

| Device Descriptors | Return Type | Description |
| ------------------ | ----------- | ----------- |
| info\:\:device\:\:ext\_intel\_pci\_address | std\:\:string | For Level Zero BE, returns the PCI address in BDF format: `domain:bus:device.function`.|
| ext\:\:intel\:\:info\:\:device\:\:pci\_address | std\:\:string | For Level Zero BE, returns the PCI address in BDF format: `domain:bus:device.function`.|


## Aspects ##
Expand All @@ -92,7 +92,7 @@ An invalid object runtime error will be thrown if the device does not support as
The PCI address can be obtained using the standard get\_info() interface.

if (dev.has(aspect::ext_intel_pci_address)) {
auto BDF = dev.get_info<info::device::ext_intel_pci_address>();
auto BDF = dev.get_info<ext::intel::info::device::pci_address>();
}


Expand All @@ -113,7 +113,7 @@ All versions of the extension support this query.

| Device Descriptors | Return Type | Description |
| ------------------ | ----------- | ----------- |
| info\:\:device\:\:ext\_intel\_gpu\_eu\_simd\_width | uint32\_t| Returns the physical SIMD width of the execution unit (EU).|
| ext\:\:intel\:\:info\:\:device\:\:gpu\_eu\_simd\_width | uint32\_t| Returns the physical SIMD width of the execution unit (EU).|


## Aspects ##
Expand All @@ -130,7 +130,7 @@ An invalid object runtime error will be thrown if the device does not support as
The physical EU SIMD width can be obtained using the standard get\_info() interface.

if (dev.has(aspect::ext_intel_gpu_eu_simd_width)) {
auto euSimdWidth = dev.get_info<info::device::ext_intel_gpu_eu_simd_width>();
auto euSimdWidth = dev.get_info<ext::intel::info::device::gpu_eu_simd_width>();
}


Expand All @@ -153,7 +153,7 @@ All versions of the extension support this query.

| Device Descriptors | Return Type | Description |
| ------------------ | ----------- | ----------- |
| info\:\:device\:\:ext\_intel\_gpu\__eu\_count | uint32\_t| Returns the number of execution units (EUs) associated with the Intel GPU.|
| ext\:\:intel\:\:info\:\:device\:\:gpu\__eu\_count | uint32\_t| Returns the number of execution units (EUs) associated with the Intel GPU.|


## Aspects ##
Expand All @@ -170,7 +170,7 @@ An invalid object runtime error will be thrown if the device does not support as
Then the number of EUs can be obtained using the standard get\_info() interface.

if (dev.has(aspect::ext_intel_gpu_eu_count)) {
auto euCount = dev.get_info<info::device::ext_intel_gpu_eu_count>();
auto euCount = dev.get_info<ext::intel::info::device::gpu_eu_count>();
}


Expand All @@ -189,7 +189,7 @@ All versions of the extension support this query.

| Device Descriptors | Return Type | Description |
| ------------------ | ----------- | ----------- |
| info\:\:device\:\:ext\_intel\_gpu\_slices | uint32\_t| Returns the number of slices.|
| ext\:\:intel\:\:info\:\:device\:\:gpu\_slices | uint32\_t| Returns the number of slices.|


## Aspects ##
Expand All @@ -206,7 +206,7 @@ An invalid object runtime error will be thrown if the device does not support as
Then the number of slices can be obtained using the standard get\_info() interface.

if (dev.has(aspect::ext_intel_gpu_slices)) {
auto slices = dev.get_info<info::device::ext_intel_gpu_slices>();
auto slices = dev.get_info<ext::intel::info::device::gpu_slices>();
}


Expand All @@ -224,7 +224,7 @@ All versions of the extension support this query.

| Device Descriptors | Return Type | Description |
| ------------------ | ----------- | ----------- |
| info\:\:device\:\:ext\_intel\_gpu\_subslices\_per\_slice | uint32\_t| Returns the number of subslices per slice.|
| ext\:\:intel\:\:info\:\:device\:\:gpu\_subslices\_per\_slice | uint32\_t| Returns the number of subslices per slice.|


## Aspects ##
Expand All @@ -241,7 +241,7 @@ An invalid object runtime error will be thrown if the device does not support as
Then the number of subslices per slice can be obtained using the standard get\_info() interface.

if (dev.has(aspect::ext_intel_gpu_subslices_per_slice)) {
auto subslices = dev.get_info<info::device::ext_intel_gpu_subslices_per_slice>();
auto subslices = dev.get_info<ext::intel::info::device::gpu_subslices_per_slice>();
}


Expand All @@ -259,7 +259,7 @@ All versions of the extension support this query.

| Device Descriptors | Return Type | Description |
| ------------------ | ----------- | ----------- |
| info\:\:device\:\:ext\_intel\_gpu\_eu\_count\_per\_subslice | uint32\_t| Returns the number of EUs in a subslice.|
| ext\:\:intel\:\:info\:\:device\:\:gpu\_eu\_count\_per\_subslice | uint32\_t| Returns the number of EUs in a subslice.|


## Aspects ##
Expand All @@ -276,7 +276,7 @@ An invalid object runtime error will be thrown if the device does not support as
Then the number of EUs per subslice can be obtained using the standard get\_info() interface.

if (dev.has(aspect::ext_intel_gpu_eu_count_per_subslice)) {
auto euCount = dev.get_info<info::device::ext_intel_gpu_eu_count_per_subslice>();
auto euCount = dev.get_info<ext::intel::info::device::gpu_eu_count_per_subslice>();
}

# Intel GPU Number of hardware threads per EU #
Expand All @@ -292,7 +292,7 @@ The extension supports this query in version 3 and later.

| Device Descriptors | Return Type | Description |
| ------------------ | ----------- | ----------- |
| info\:\:device\:\:ext\_intel\_gpu\_hw\_threads\_per\_eu | uint32\_t| Returns the number of hardware threads in EU.|
| ext\:\:intel\:\:info\:\:device\:\:gpu\_hw\_threads\_per\_eu | uint32\_t| Returns the number of hardware threads in EU.|


## Aspects ##
Expand All @@ -309,7 +309,7 @@ An invalid object runtime error will be thrown if the device does not support as
Then the number of hardware threads per EU can be obtained using the standard get\_info() interface.

if (dev.has(aspect::ext_intel_gpu_hw_threads_per_eu)) {
auto threadsCount = dev.get_info<info::device::ext_intel_gpu_hw_threads_per_eu>();
auto threadsCount = dev.get_info<ext::intel::info::device::gpu_hw_threads_per_eu>();
}

# Maximum Memory Bandwidth #
Expand All @@ -328,7 +328,7 @@ All versions of the extension support this query.

| Device Descriptors | Return Type | Description |
| ------------------ | ----------- | ----------- |
| info\:\:device\:\:ext\_intel\_max\_mem\_bandwidth | uint64\_t| Returns the maximum memory bandwidth in units of bytes\/second.|
| ext\:\:intel\:\:info\:\:device\:\:max\_mem\_bandwidth | uint64\_t| Returns the maximum memory bandwidth in units of bytes\/second.|


## Aspects ##
Expand All @@ -346,7 +346,7 @@ An invalid object runtime error will be thrown if the device does not support as
Then the maximum memory bandwidth can be obtained using the standard get\_info() interface.

if (dev.has(aspect::ext_intel_max_mem_bandwidth)) {
auto maxBW = dev.get_info<info::device::ext_intel_max_mem_bandwidth>();
auto maxBW = dev.get_info<ext::intel::info::device::max_mem_bandwidth>();
}

# Free Global Memory #
Expand All @@ -366,7 +366,7 @@ The extension supports this query in version 4 and later.

| Device Descriptors | Return Type | Description |
| ------------------ | ----------- | ----------- |
| info\:\:device\:\:ext\_intel\_free\_memory | uint64\_t| Returns the memory avialble on the device in units of bytes.|
| ext\:\:intel\:\:info\:\:device\:\:free\_memory | uint64\_t| Returns the memory avialble on the device in units of bytes.|


## Aspects ##
Expand All @@ -384,5 +384,21 @@ An invalid object runtime error will be thrown if the device does not support as
Then the free device memory can be obtained using the standard get\_info() interface.

if (dev.has(aspect::ext_intel_free_memory)) {
auto FreeMemory = dev.get_info<info::device::ext_intel_free_memory>();
auto FreeMemory = dev.get_info<ext::intel::info::device::free_memory>();
}

# Deprecated queries #

The table below lists deprecated, that would soon be removed and their replacements:

|Deprecated Descriptors | Replacement Descriptors |
| ------------------------------- |--------------------------- |
| info\:\:device\:\:ext\_intel\_device\_info\_uuid | ext\:\:intel\:\:info\:\:device\:\:uuid |
| info\:\:device\:\:ext\_intel\_pci\_address | ext\:\:intel\:\:info\:\:device\:\:pci\_address |
| info\:\:device\:\:ext\_intel\_gpu\_eu\_simd\_width | ext\:\:intel\:\:info\:\:device\:\:gpu\_eu\_simd\_width |
| info\:\:device\:\:ext\_intel\_gpu\__eu\_count | ext\:\:intel\:\:info\:\:device\:\:gpu\__eu\_count |
| info\:\:device\:\:ext\_intel\_gpu\_slices | ext\:\:intel\:\:info\:\:device\:\:gpu\_slices |
| info\:\:device\:\:ext\_intel\_gpu\_subslices\_per\_slice | ext\:\:intel\:\:info\:\:device\:\:gpu\_subslices\_per\_slice |
|info\:\:device\:\:ext\_intel\_gpu\_eu\_count\_per\_subslice | ext\:\:intel\:\:info\:\:device\:\:gpu\_eu\_count\_per\_subslice |
| info\:\:device\:\:ext\_intel\_gpu\_hw\_threads\_per\_eu | ext\:\:intel\:\:info\:\:device\:\:gpu\_hw\_threads\_per\_eu |
| info\:\:device\:\:ext\_intel\_max\_mem\_bandwidth | ext\:\:intel\:\:info\:\:device\:\:max\_mem\_bandwidth |
15 changes: 15 additions & 0 deletions sycl/include/sycl/detail/info_desc_helpers.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,21 @@ struct IsSubGroupInfo<info::kernel_device_specific::compile_sub_group_size>
};
#include <sycl/info/device_traits.def>
#undef __SYCL_PARAM_TRAITS_SPEC

#define __SYCL_PARAM_TRAITS_SPEC(Namespace, DescType, Desc, ReturnT, PiCode) \
template <> struct PiInfoCode<Namespace::info::DescType::Desc> { \
static constexpr pi_device_info value = \
static_cast<pi_device_info>(PiCode); \
}; \
template <> \
struct is_##DescType##_info_desc<Namespace::info::DescType::Desc> \
: std::true_type { \
using return_type = Namespace::info::DescType::Desc::return_type; \
};
#include <sycl/info/ext_intel_device_traits.def>
#include <sycl/info/ext_oneapi_device_traits.def>
#undef __SYCL_PARAM_TRAITS_SPEC

} // namespace detail
} // __SYCL_INLINE_NAMESPACE(_V1)
} // namespace sycl
3 changes: 2 additions & 1 deletion sycl/include/sycl/ext/intel/fpga_device_selector.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,8 @@ class platform_selector : public device_selector {

int operator()(const device &device) const override {
const platform &pf = device.get_platform();
const std::string &platform_name = pf.get_info<info::platform::name>();
const std::string &platform_name =
pf.get_info<sycl::info::platform::name>();
if (platform_name == device_platform_name) {
return 10000;
}
Expand Down
Loading