Commit 0f4a0f3

[SYCL] Make intel specific device info descriptors namespace qualified (#6639)
Conforming to SYCL 2020 specification sections 6.3.1 and 4.6.4.2, this change makes extension information descriptors templated and places them within the correct namespace.

- Also moved deprecated info descriptors for device into a separate file
- Changed the namespace of the recently added [device memory extension](#6604) to ext::intel::info::device

Signed-off-by: Rauf, Rana <[email protected]>
Co-authored-by: Steffen Larsen <[email protected]>
1 parent b81f9df commit 0f4a0f3

16 files changed: 448 additions, 136 deletions

sycl/doc/extensions/experimental/sycl_ext_oneapi_max_work_group_query.md

Lines changed: 18 additions & 10 deletions
@@ -21,15 +21,15 @@ As encouraged by the SYCL specification, a feature-test macro, `SYCL_EXT_ONEAPI_
2121

2222
| Device descriptors | Return type | Description |
2323
| ------------------------------------------------------ | ----------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
24-
| info::device::ext_oneapi_max_work_groups_1d |  id<1> | Returns the maximum number of work-groups that can be submitted in each dimension of the `globalSize` of a `nd_range<1>`. The minimum value is `(1)` if the device is different than `info::device_type::custom`. |
25-
| info::device::ext_oneapi_max_work_groups_2d |  id<2> | Returns the maximum number of work-groups that can be submitted in each dimension of the `globalSize` of a `nd_range<2>`. The minimum value is `(1, 1)` if the device is different than `info::device_type::custom`. |
26-
| info::device::ext_oneapi_max_work_groups_3d |  id<3> | Returns the maximum number of work-groups that can be submitted in each dimension of the `globalSize` of a `nd_range<3>`. The minimum value is `(1, 1, 1)` if the device is different than `info::device_type::custom`. |
27-
| info::device::ext_oneapi_max_global_work_groups |  size_t | Returns the maximum number of work-groups that can be submitted across all the dimensions. The minimum value is `1`. |
24+
| ext::oneapi::experimental::info::device::max_work_groups<1> |  id<1> | Returns the maximum number of work-groups that can be submitted in each dimension of the `globalSize` of a `nd_range<1>`. The minimum value is `(1)` if the device is different than `info::device_type::custom`. |
25+
| ext::oneapi::experimental::info::device::max_work_groups<2> |  id<2> | Returns the maximum number of work-groups that can be submitted in each dimension of the `globalSize` of a `nd_range<2>`. The minimum value is `(1, 1)` if the device is different than `info::device_type::custom`. |
26+
| ext::oneapi::experimental::info::device::max_work_groups<3> |  id<3> | Returns the maximum number of work-groups that can be submitted in each dimension of the `globalSize` of a `nd_range<3>`. The minimum value is `(1, 1, 1)` if the device is different than `info::device_type::custom`. |
27+
| ext::oneapi::experimental::info::device::max_global_work_groups |  size_t | Returns the maximum number of work-groups that can be submitted across all the dimensions. The minimum value is `1`. |
2828

2929
### Note
3030

3131
- The returned values have the same ordering as the `nd_range` arguments.
32-
- The implementation does not guarantee that the user could select all the maximum numbers returned by `ext_oneapi_max_work_groups` at the same time. Thus the user should also check that the selected number of work-groups across all dimensions is smaller than the maximum global number returned by `ext_oneapi_max_global_work_groups`.
32+
- The implementation does not guarantee that the user could select all the maximum numbers returned by `max_work_groups` at the same time. Thus the user should also check that the selected number of work-groups across all dimensions is smaller than the maximum global number returned by `max_global_work_groups`.
3333

3434
## Examples
3535

@@ -38,8 +38,8 @@ sycl::device gpu = sycl::device{sycl::gpu_selector_v};
3838
std::cout << gpu.get_info<sycl::info::device::name>() << '\n';
3939

4040
#ifdef SYCL_EXT_ONEAPI_MAX_WORK_GROUP_QUERY
41-
sycl::id<3> groups = gpu.get_info<sycl::info::device::ext_oneapi_max_work_groups_3d>();
42-
size_t global_groups = gpu.get_info<sycl::info::device::ext_oneapi_max_global_work_groups>();
41+
sycl::id<3> groups = gpu.get_info<sycl::ext::oneapi::experimental::info::device::max_work_groups<3>>();
42+
size_t global_groups = gpu.get_info<sycl::ext::oneapi::experimental::info::device::max_global_work_groups>();
4343
std::cout << "Max number groups: x_max: " << groups[2] << " y_max: " << groups[1] << " z_max: " << groups[0] << '\n';
4444
std::cout << "Max global number groups: " << global_groups << '\n';
4545
#endif
@@ -68,12 +68,20 @@ assert(global_size[2] * global_size[1] * global_size[0] <= global_groups); //Mak
6868
6969
gpu_queue.submit(work_range, ...);
7070
```
71+
## Deprecated queries
7172

72-
## Implementation
73+
The table below lists the deprecated descriptors, which will soon be removed, and their replacements:
74+
75+
|Deprecated Descriptors| Replacement Descriptors|
76+
| -------------------- | -------------------- |
77+
| sycl::info::device::ext_oneapi_max_global_work_groups |sycl::ext::oneapi::experimental::info::device::max_global_work_groups |
78+
| sycl::info::device::ext_oneapi_max_work_groups_*N*d | sycl::ext::oneapi::experimental::info::device::max_work_groups\<*N*\> |
79+
80+
* Note: *N* can take the value 1, 2, or 3.
7381

74-
### Templated queries
7582

76-
Right now, DPC++ does not support templated device descriptors as they are defined in the SYCL specification section 4.6.4.2 "Device information descriptors". When the implementation supports this syntax, `ext_oneapi_max_work_groups_[1,2,3]d` should be replaced by the templated syntax: `ext_oneapi_max_work_groups<[1,2,3]>`.
83+
## Implementation
84+
7785
### Consistency with existing checks
7886

7987
The implementation already checks when enqueuing a kernel that the global and per dimension work-group number is smaller than `std::numeric_limits<int>::max`. This check is implemented in `sycl/include/sycl/handler.hpp`. For consistency, values returned by the two device descriptors are bound by this limit.
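
The note in this document asks users to validate a chosen launch size against both the per-dimension maxima and the global maximum. A minimal sketch of that check in plain C++ follows. The helper name `fits_work_group_limits` and its parameters are hypothetical and not part of DPC++; the limits would come from the `max_work_groups<3>` and `max_global_work_groups` queries shown in the example. The comparison uses `<=`, matching the `assert` in the document's example.

```cpp
#include <array>
#include <cstddef>
#include <limits>

// Hypothetical helper, not part of DPC++: returns true when a requested
// 3-D work-group count respects both the per-dimension limits and the
// global limit across all dimensions.
bool fits_work_group_limits(const std::array<std::size_t, 3> &requested,
                            const std::array<std::size_t, 3> &max_per_dim,
                            std::size_t max_global) {
  std::size_t total = 1;
  for (std::size_t i = 0; i < 3; ++i) {
    // Each dimension needs at least one group and must stay within its limit.
    if (requested[i] == 0 || requested[i] > max_per_dim[i])
      return false;
    // Guard against overflow before multiplying.
    if (total > std::numeric_limits<std::size_t>::max() / requested[i])
      return false;
    total *= requested[i];
  }
  // The product over all dimensions must also fit the global limit.
  return total <= max_global;
}
```

A request can pass every per-dimension test and still fail the global one, which is why both checks are needed.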

sycl/doc/extensions/supported/sycl_ext_intel_device_info.md

Lines changed: 36 additions & 20 deletions
@@ -35,7 +35,7 @@ The extension supports this query in version 2 and later.
3535

3636
| Device Descriptors | Return Type | Description |
3737
| ------------------ | ----------- | ----------- |
38-
| info\:\:device\:\:ext\_intel\_device\_info\_uuid | unsigned char | Returns the device UUID|
38+
| ext\:\:intel\:\:info\:\:device\:\:uuid | unsigned char | Returns the device UUID|
3939

4040

4141
## Aspects ##
@@ -52,7 +52,7 @@ An invalid object runtime error will be thrown if the device does not support as
5252
The UUID can be obtained using the standard get\_info() interface.
5353

5454
if (dev.has(aspect::ext_intel_device_info_uuid)) {
55-
auto UUID = dev.get_info<info::device::ext_intel_device_info_uuid>();
55+
auto UUID = dev.get_info<ext::intel::info::device::uuid>();
5656
}
5757

5858

@@ -75,7 +75,7 @@ All versions of the extension support this query.
7575

7676
| Device Descriptors | Return Type | Description |
7777
| ------------------ | ----------- | ----------- |
78-
| info\:\:device\:\:ext\_intel\_pci\_address | std\:\:string | For Level Zero BE, returns the PCI address in BDF format: `domain:bus:device.function`.|
78+
| ext\:\:intel\:\:info\:\:device\:\:pci\_address | std\:\:string | For Level Zero BE, returns the PCI address in BDF format: `domain:bus:device.function`.|
7979

8080

8181
## Aspects ##
@@ -92,7 +92,7 @@ An invalid object runtime error will be thrown if the device does not support as
9292
The PCI address can be obtained using the standard get\_info() interface.
9393

9494
if (dev.has(aspect::ext_intel_pci_address)) {
95-
auto BDF = dev.get_info<info::device::ext_intel_pci_address>();
95+
auto BDF = dev.get_info<ext::intel::info::device::pci_address>();
9696
}
9797

9898

@@ -113,7 +113,7 @@ All versions of the extension support this query.
113113

114114
| Device Descriptors | Return Type | Description |
115115
| ------------------ | ----------- | ----------- |
116-
| info\:\:device\:\:ext\_intel\_gpu\_eu\_simd\_width | uint32\_t| Returns the physical SIMD width of the execution unit (EU).|
116+
| ext\:\:intel\:\:info\:\:device\:\:gpu\_eu\_simd\_width | uint32\_t| Returns the physical SIMD width of the execution unit (EU).|
117117

118118

119119
## Aspects ##
@@ -130,7 +130,7 @@ An invalid object runtime error will be thrown if the device does not support as
130130
The physical EU SIMD width can be obtained using the standard get\_info() interface.
131131

132132
if (dev.has(aspect::ext_intel_gpu_eu_simd_width)) {
133-
auto euSimdWidth = dev.get_info<info::device::ext_intel_gpu_eu_simd_width>();
133+
auto euSimdWidth = dev.get_info<ext::intel::info::device::gpu_eu_simd_width>();
134134
}
135135

136136

@@ -153,7 +153,7 @@ All versions of the extension support this query.
153153

154154
| Device Descriptors | Return Type | Description |
155155
| ------------------ | ----------- | ----------- |
156-
| info\:\:device\:\:ext\_intel\_gpu\_eu\_count | uint32\_t| Returns the number of execution units (EUs) associated with the Intel GPU.|
156+
| ext\:\:intel\:\:info\:\:device\:\:gpu\_eu\_count | uint32\_t| Returns the number of execution units (EUs) associated with the Intel GPU.|
157157

158158

159159
## Aspects ##
@@ -170,7 +170,7 @@ An invalid object runtime error will be thrown if the device does not support as
170170
Then the number of EUs can be obtained using the standard get\_info() interface.
171171

172172
if (dev.has(aspect::ext_intel_gpu_eu_count)) {
173-
auto euCount = dev.get_info<info::device::ext_intel_gpu_eu_count>();
173+
auto euCount = dev.get_info<ext::intel::info::device::gpu_eu_count>();
174174
}
175175

176176

@@ -189,7 +189,7 @@ All versions of the extension support this query.
189189

190190
| Device Descriptors | Return Type | Description |
191191
| ------------------ | ----------- | ----------- |
192-
| info\:\:device\:\:ext\_intel\_gpu\_slices | uint32\_t| Returns the number of slices.|
192+
| ext\:\:intel\:\:info\:\:device\:\:gpu\_slices | uint32\_t| Returns the number of slices.|
193193

194194

195195
## Aspects ##
@@ -206,7 +206,7 @@ An invalid object runtime error will be thrown if the device does not support as
206206
Then the number of slices can be obtained using the standard get\_info() interface.
207207

208208
if (dev.has(aspect::ext_intel_gpu_slices)) {
209-
auto slices = dev.get_info<info::device::ext_intel_gpu_slices>();
209+
auto slices = dev.get_info<ext::intel::info::device::gpu_slices>();
210210
}
211211

212212

@@ -224,7 +224,7 @@ All versions of the extension support this query.
224224

225225
| Device Descriptors | Return Type | Description |
226226
| ------------------ | ----------- | ----------- |
227-
| info\:\:device\:\:ext\_intel\_gpu\_subslices\_per\_slice | uint32\_t| Returns the number of subslices per slice.|
227+
| ext\:\:intel\:\:info\:\:device\:\:gpu\_subslices\_per\_slice | uint32\_t| Returns the number of subslices per slice.|
228228

229229

230230
## Aspects ##
@@ -241,7 +241,7 @@ An invalid object runtime error will be thrown if the device does not support as
241241
Then the number of subslices per slice can be obtained using the standard get\_info() interface.
242242

243243
if (dev.has(aspect::ext_intel_gpu_subslices_per_slice)) {
244-
auto subslices = dev.get_info<info::device::ext_intel_gpu_subslices_per_slice>();
244+
auto subslices = dev.get_info<ext::intel::info::device::gpu_subslices_per_slice>();
245245
}
246246

247247

@@ -259,7 +259,7 @@ All versions of the extension support this query.
259259

260260
| Device Descriptors | Return Type | Description |
261261
| ------------------ | ----------- | ----------- |
262-
| info\:\:device\:\:ext\_intel\_gpu\_eu\_count\_per\_subslice | uint32\_t| Returns the number of EUs in a subslice.|
262+
| ext\:\:intel\:\:info\:\:device\:\:gpu\_eu\_count\_per\_subslice | uint32\_t| Returns the number of EUs in a subslice.|
263263

264264

265265
## Aspects ##
@@ -276,7 +276,7 @@ An invalid object runtime error will be thrown if the device does not support as
276276
Then the number of EUs per subslice can be obtained using the standard get\_info() interface.
277277

278278
if (dev.has(aspect::ext_intel_gpu_eu_count_per_subslice)) {
279-
auto euCount = dev.get_info<info::device::ext_intel_gpu_eu_count_per_subslice>();
279+
auto euCount = dev.get_info<ext::intel::info::device::gpu_eu_count_per_subslice>();
280280
}
281281

282282
# Intel GPU Number of hardware threads per EU #
@@ -292,7 +292,7 @@ The extension supports this query in version 3 and later.
292292

293293
| Device Descriptors | Return Type | Description |
294294
| ------------------ | ----------- | ----------- |
295-
| info\:\:device\:\:ext\_intel\_gpu\_hw\_threads\_per\_eu | uint32\_t| Returns the number of hardware threads in EU.|
295+
| ext\:\:intel\:\:info\:\:device\:\:gpu\_hw\_threads\_per\_eu | uint32\_t| Returns the number of hardware threads in EU.|
296296

297297

298298
## Aspects ##
@@ -309,7 +309,7 @@ An invalid object runtime error will be thrown if the device does not support as
309309
Then the number of hardware threads per EU can be obtained using the standard get\_info() interface.
310310

311311
if (dev.has(aspect::ext_intel_gpu_hw_threads_per_eu)) {
312-
auto threadsCount = dev.get_info<info::device::ext_intel_gpu_hw_threads_per_eu>();
312+
auto threadsCount = dev.get_info<ext::intel::info::device::gpu_hw_threads_per_eu>();
313313
}
314314

315315
# Maximum Memory Bandwidth #
@@ -328,7 +328,7 @@ All versions of the extension support this query.
328328

329329
| Device Descriptors | Return Type | Description |
330330
| ------------------ | ----------- | ----------- |
331-
| info\:\:device\:\:ext\_intel\_max\_mem\_bandwidth | uint64\_t| Returns the maximum memory bandwidth in units of bytes\/second.|
331+
| ext\:\:intel\:\:info\:\:device\:\:max\_mem\_bandwidth | uint64\_t| Returns the maximum memory bandwidth in units of bytes\/second.|
332332

333333

334334
## Aspects ##
@@ -346,7 +346,7 @@ An invalid object runtime error will be thrown if the device does not support as
346346
Then the maximum memory bandwidth can be obtained using the standard get\_info() interface.
347347

348348
if (dev.has(aspect::ext_intel_max_mem_bandwidth)) {
349-
auto maxBW = dev.get_info<info::device::ext_intel_max_mem_bandwidth>();
349+
auto maxBW = dev.get_info<ext::intel::info::device::max_mem_bandwidth>();
350350
}
351351

352352
# Free Global Memory #
@@ -366,7 +366,7 @@ The extension supports this query in version 4 and later.
366366

367367
| Device Descriptors | Return Type | Description |
368368
| ------------------ | ----------- | ----------- |
369-
| info\:\:device\:\:ext\_intel\_free\_memory | uint64\_t| Returns the memory available on the device in units of bytes.|
369+
| ext\:\:intel\:\:info\:\:device\:\:free\_memory | uint64\_t| Returns the memory available on the device in units of bytes.|
370370

371371

372372
## Aspects ##
@@ -384,5 +384,21 @@ An invalid object runtime error will be thrown if the device does not support as
384384
Then the free device memory can be obtained using the standard get\_info() interface.
385385

386386
if (dev.has(aspect::ext_intel_free_memory)) {
387-
auto FreeMemory = dev.get_info<info::device::ext_intel_free_memory>();
387+
auto FreeMemory = dev.get_info<ext::intel::info::device::free_memory>();
388388
}
389+
390+
# Deprecated queries #
391+
392+
The table below lists the deprecated descriptors, which will soon be removed, and their replacements:
393+
394+
|Deprecated Descriptors | Replacement Descriptors |
395+
| ------------------------------- |--------------------------- |
396+
| info\:\:device\:\:ext\_intel\_device\_info\_uuid | ext\:\:intel\:\:info\:\:device\:\:uuid |
397+
| info\:\:device\:\:ext\_intel\_pci\_address | ext\:\:intel\:\:info\:\:device\:\:pci\_address |
398+
| info\:\:device\:\:ext\_intel\_gpu\_eu\_simd\_width | ext\:\:intel\:\:info\:\:device\:\:gpu\_eu\_simd\_width |
399+
| info\:\:device\:\:ext\_intel\_gpu\_eu\_count | ext\:\:intel\:\:info\:\:device\:\:gpu\_eu\_count |
400+
| info\:\:device\:\:ext\_intel\_gpu\_slices | ext\:\:intel\:\:info\:\:device\:\:gpu\_slices |
401+
| info\:\:device\:\:ext\_intel\_gpu\_subslices\_per\_slice | ext\:\:intel\:\:info\:\:device\:\:gpu\_subslices\_per\_slice |
402+
|info\:\:device\:\:ext\_intel\_gpu\_eu\_count\_per\_subslice | ext\:\:intel\:\:info\:\:device\:\:gpu\_eu\_count\_per\_subslice |
403+
| info\:\:device\:\:ext\_intel\_gpu\_hw\_threads\_per\_eu | ext\:\:intel\:\:info\:\:device\:\:gpu\_hw\_threads\_per\_eu |
404+
| info\:\:device\:\:ext\_intel\_max\_mem\_bandwidth | ext\:\:intel\:\:info\:\:device\:\:max\_mem\_bandwidth |
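
The renames in the table follow a regular pattern: the `ext_intel_` prefix moves out of the descriptor name and into the `ext::intel` namespace. As a rough migration aid, the pattern can be sketched as a string rewrite. The function `replacement_descriptor` is hypothetical and exists only for illustration; it is not part of DPC++.

```cpp
#include <string>

// Hypothetical helper for illustration: rewrites a deprecated descriptor
// name into its replacement, following the pattern in the table above.
// Names that do not match the deprecated pattern are returned unchanged.
std::string replacement_descriptor(const std::string &name) {
  const std::string old_prefix = "info::device::ext_intel_";
  const std::string new_prefix = "ext::intel::info::device::";
  if (name.compare(0, old_prefix.size(), old_prefix) != 0)
    return name;
  std::string rest = name.substr(old_prefix.size());
  // The UUID descriptor is the one entry that drops more than the prefix.
  if (rest == "device_info_uuid")
    rest = "uuid";
  return new_prefix + rest;
}
```

The UUID entry is the only special case because its old name carried an extra `device_info_` segment.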

sycl/include/sycl/detail/info_desc_helpers.hpp

Lines changed: 15 additions & 0 deletions
@@ -109,6 +109,21 @@ struct IsSubGroupInfo<info::kernel_device_specific::compile_sub_group_size>
109109
};
110110
#include <sycl/info/device_traits.def>
111111
#undef __SYCL_PARAM_TRAITS_SPEC
112+
113+
#define __SYCL_PARAM_TRAITS_SPEC(Namespace, DescType, Desc, ReturnT, PiCode) \
114+
template <> struct PiInfoCode<Namespace::info::DescType::Desc> { \
115+
static constexpr pi_device_info value = \
116+
static_cast<pi_device_info>(PiCode); \
117+
}; \
118+
template <> \
119+
struct is_##DescType##_info_desc<Namespace::info::DescType::Desc> \
120+
: std::true_type { \
121+
using return_type = Namespace::info::DescType::Desc::return_type; \
122+
};
123+
#include <sycl/info/ext_intel_device_traits.def>
124+
#include <sycl/info/ext_oneapi_device_traits.def>
125+
#undef __SYCL_PARAM_TRAITS_SPEC
126+
112127
} // namespace detail
113128
} // __SYCL_INLINE_NAMESPACE(_V1)
114129
} // namespace sycl
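
The hunk above defines a macro, includes `.def` files that invoke it once per descriptor, and then undefines it. A self-contained sketch of that pattern follows, assuming a made-up `demo` namespace, made-up numeric codes, and an inline list macro standing in for the `.def` file. None of these names come from DPC++.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for descriptor types; names are illustrative only.
namespace demo {
namespace info {
namespace device {
struct uuid { using return_type = unsigned char; };
struct free_memory { using return_type = std::uint64_t; };
} // namespace device
} // namespace info
} // namespace demo

namespace detail {
// Primary templates: by default a type is not a device descriptor.
template <typename T> struct is_device_info_desc : std::false_type {};
template <typename T> struct InfoCode;

// Stand-in for the .def file: one entry per descriptor.
#define DEMO_DEVICE_TRAITS(X)                                                  \
  X(demo, device, uuid, 0x1001)                                                \
  X(demo, device, free_memory, 0x1002)

// Each entry expands to two explicit specializations.
#define DEMO_PARAM_TRAITS_SPEC(Namespace, DescType, Desc, Code)                \
  template <> struct InfoCode<Namespace::info::DescType::Desc> {               \
    static constexpr int value = Code;                                         \
  };                                                                           \
  template <>                                                                  \
  struct is_##DescType##_info_desc<Namespace::info::DescType::Desc>            \
      : std::true_type {                                                       \
    using return_type = Namespace::info::DescType::Desc::return_type;          \
  };
DEMO_DEVICE_TRAITS(DEMO_PARAM_TRAITS_SPEC)
#undef DEMO_PARAM_TRAITS_SPEC
#undef DEMO_DEVICE_TRAITS
} // namespace detail
```

Passing the namespace as a macro argument is what lets one macro serve descriptors from several extension namespaces, which appears to be the purpose of the extra `Namespace` parameter in the diff.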

sycl/include/sycl/ext/intel/fpga_device_selector.hpp

Lines changed: 2 additions & 1 deletion
@@ -32,7 +32,8 @@ class platform_selector : public device_selector {
3232

3333
int operator()(const device &device) const override {
3434
const platform &pf = device.get_platform();
35-
const std::string &platform_name = pf.get_info<info::platform::name>();
35+
const std::string &platform_name =
36+
pf.get_info<sycl::info::platform::name>();
3637
if (platform_name == device_platform_name) {
3738
return 10000;
3839
}
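
The selector in this hunk returns a high score when the platform name matches. In SYCL, the device with the highest score is chosen and a device with a negative score is never chosen. The sketch below imitates that scoring without SYCL. `FakeDevice`, `score_by_platform`, and `pick_device` are hypothetical names, and the `-1` returned for a non-match is an assumption because the hunk does not show that branch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a device; only the platform name matters here.
struct FakeDevice {
  std::string platform_name;
};

// Mirrors the scoring idea in the hunk above: a match gets a high score.
// The -1 for a non-match is an assumption, not taken from the diff.
int score_by_platform(const FakeDevice &dev, const std::string &wanted) {
  return dev.platform_name == wanted ? 10000 : -1;
}

// Returns the index of the highest-scoring device, or -1 if every device
// scored negative and therefore none qualifies.
int pick_device(const std::vector<FakeDevice> &devices,
                const std::string &wanted) {
  int best_index = -1;
  int best_score = -1;
  for (std::size_t i = 0; i < devices.size(); ++i) {
    const int score = score_by_platform(devices[i], wanted);
    if (score > best_score) {
      best_score = score;
      best_index = static_cast<int>(i);
    }
  }
  return best_index;
}
```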
