Skip to content

[SYCL] Implement sycl_ext_oneapi_device_architecture on host for Level Zero and OpenCL #9843

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Jun 30, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -523,23 +523,12 @@ call to `if_architecture_is` or `else_if_architecture_is` whose condition is
architectures in the `Archs` parameter pack.


== Limitations with the experimental version

The {dpcpp} implementation of this extension currently has some important
limitations. The application must be compiled in ahead-of-time (AOT) mode
using `-fsycl-targets=<special-target>` where `<special-target>` is one of the
"special target values" listed in the link:../../UsersManual.md[users manual]
description of the `-fsycl-targets` option. These are the target names of the
form "intel_gpu_*", "nvidia_gpu_*", or "amd_gpu_*".


== Future direction
=== New member function of `device` class

This experimental extension is still evolving. We expect that future versions
will include the following:
This extension adds the following new member function to the `device` class,
which returns a Boolean telling whether the device has the specified
architecture.

* An extended member function like:
+
--
```
namespace sycl {
Expand All @@ -551,31 +540,56 @@ class device {

// namespace sycl
```

This provides a way to query a device's architecture from host code.
--

* An extended device information descriptor named
`sycl::ext::oneapi::experimental::info::device::architecture`, which returns
the architecture of the device. This allows host code such as:
+
=== New device descriptor

[%header,cols="5,1,5"]
|===
|Device descriptor
|Return type
|Description

|`ext::oneapi::experimental::info::device::architecture`
|`ext::oneapi::experimental::architecture`
|Returns the architecture of the device

|===

This device descriptor allows host code such as:

--
```
using namespace sycl::ext::oneapi::experimental;
namespace syclex = sycl::ext::oneapi::experimental;

architecture arch = dev.get_info<info::device::architecture>();
syclex::architecture arch = dev.get_info<syclex::info::device::architecture>();
switch (arch) {
case architecture::x86_64:
case syclex::architecture::x86_64:
/* ... */
break;
case architecture::intel_gpu_bdw:
case syclex::architecture::intel_gpu_bdw:
/* ... */
break;
/* etc. */
}
```
--

== Limitations with the experimental version

The {dpcpp} implementation of this extension currently has some important
limitations. The application must be compiled in ahead-of-time (AOT) mode
using `-fsycl-targets=<special-target>` where `<special-target>` is one of the
"special target values" listed in the link:../../UsersManual.md[users manual]
description of the `-fsycl-targets` option. These are the target names of the
form "intel_gpu_*", "nvidia_gpu_*", or "amd_gpu_*".


== Future direction

This experimental extension is still evolving. We expect that future versions
will include the following:

* A compile-time constant property that can be used to decorate kernels and
non-kernel device functions:
+
Expand Down
1 change: 1 addition & 0 deletions sycl/include/sycl/detail/pi.h
Original file line number Diff line number Diff line change
Expand Up @@ -319,6 +319,7 @@ typedef enum {
// Intel UUID extension.
PI_DEVICE_INFO_UUID = 0x106A,
// These are Intel-specific extensions.
PI_EXT_ONEAPI_DEVICE_INFO_IP_VERSION = 0x4250,
PI_DEVICE_INFO_DEVICE_ID = 0x4251,
PI_DEVICE_INFO_PCI_ADDRESS = 0x10020,
PI_DEVICE_INFO_GPU_EU_COUNT = 0x10021,
Expand Down
11 changes: 11 additions & 0 deletions sycl/include/sycl/device.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
#include <sycl/detail/export.hpp>
#include <sycl/detail/info_desc_helpers.hpp>
#include <sycl/detail/owner_less_base.hpp>
#include <sycl/ext/oneapi/experimental/device_architecture.hpp>
#include <sycl/ext/oneapi/weak_object_base.hpp>
#include <sycl/info/info_desc.hpp>
#include <sycl/platform.hpp>
Expand Down Expand Up @@ -240,6 +241,16 @@ class __SYCL_EXPORT device : public detail::OwnerLessBase<device> {
/// \return true if the SYCL device has the given feature.
bool has(aspect Aspect) const __SYCL_WARN_IMAGE_ASPECT(Aspect);

/// Indicates if the SYCL device architecture equals to the one passed to
/// the function.
///
/// \param arch is one of the architectures from architecture enum described
/// in sycl_ext_oneapi_device_architecture specification.
///
/// \return true if the SYCL device architecture equals to the one passed to
/// the function.
bool ext_oneapi_architecture_is(ext::oneapi::experimental::architecture arch);

// TODO: Remove this diagnostics when __SYCL_WARN_IMAGE_ASPECT is removed.
#if defined(__clang__)
#pragma clang diagnostic pop
Expand Down
3 changes: 3 additions & 0 deletions sycl/include/sycl/info/ext_oneapi_device_traits.def
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,9 @@ __SYCL_PARAM_TRAITS_SPEC(ext::oneapi::experimental,device, max_global_work_group
__SYCL_PARAM_TRAITS_TEMPLATE_SPEC(ext::oneapi::experimental,device, max_work_groups<1>, id<1>, PI_EXT_ONEAPI_DEVICE_INFO_MAX_WORK_GROUPS_1D)
__SYCL_PARAM_TRAITS_TEMPLATE_SPEC(ext::oneapi::experimental,device, max_work_groups<2>, id<2>, PI_EXT_ONEAPI_DEVICE_INFO_MAX_WORK_GROUPS_2D)
__SYCL_PARAM_TRAITS_TEMPLATE_SPEC(ext::oneapi::experimental,device, max_work_groups<3>, id<3>, PI_EXT_ONEAPI_DEVICE_INFO_MAX_WORK_GROUPS_3D)
__SYCL_PARAM_TRAITS_SPEC(ext::oneapi::experimental, device, architecture,
ext::oneapi::experimental::architecture,
PI_EXT_ONEAPI_DEVICE_INFO_IP_VERSION)
#ifdef __SYCL_PARAM_TRAITS_TEMPLATE_SPEC_NEEDS_UNDEF
#undef __SYCL_PARAM_TRAITS_TEMPLATE_SPEC
#undef __SYCL_PARAM_TRAITS_TEMPLATE_SPEC_NEEDS_UNDEF
Expand Down
1 change: 1 addition & 0 deletions sycl/include/sycl/info/info_desc.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
#include <sycl/aspects.hpp>
#include <sycl/detail/common.hpp>
#include <sycl/detail/pi.hpp>
#include <sycl/ext/oneapi/experimental/device_architecture.hpp>
#include <sycl/id.hpp>

namespace sycl {
Expand Down
3 changes: 3 additions & 0 deletions sycl/plugins/unified_runtime/pi2ur.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -1047,6 +1047,9 @@ inline pi_result piDeviceGetInfo(pi_device Device, pi_device_info ParamName,
case PI_DEVICE_INFO_GPU_SUBSLICES_PER_SLICE:
InfoType = UR_DEVICE_INFO_GPU_SUBSLICES_PER_SLICE;
break;
case PI_EXT_ONEAPI_DEVICE_INFO_IP_VERSION:
InfoType = UR_DEVICE_INFO_IP_VERSION;
break;
case PI_DEVICE_INFO_BUILD_ON_SUBDEVICE:
InfoType = UR_DEVICE_INFO_BUILD_ON_SUBDEVICE;
break;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -239,6 +239,10 @@ ze_structure_type_t getZeStructureType<ze_device_memory_ext_properties_t>() {
return ZE_STRUCTURE_TYPE_DEVICE_MEMORY_EXT_PROPERTIES;
}
template <>
ze_structure_type_t getZeStructureType<ze_device_ip_version_ext_t>() {
return ZE_STRUCTURE_TYPE_DEVICE_IP_VERSION_EXT;
}
template <>
ze_structure_type_t getZeStructureType<ze_device_memory_access_properties_t>() {
return ZE_STRUCTURE_TYPE_DEVICE_MEMORY_ACCESS_PROPERTIES;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -396,6 +396,8 @@ UR_APIEXPORT ur_result_t UR_APICALL urDeviceGetInfo(
uint32_t{1});
case UR_DEVICE_INFO_GLOBAL_MEM_CACHE_SIZE:
return ReturnValue(uint64_t{Device->ZeDeviceCacheProperties->cacheSize});
case UR_DEVICE_INFO_IP_VERSION:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes in L0 files look good.

return ReturnValue(uint32_t{Device->ZeDeviceIpVersionExt->ipVersion});
case UR_DEVICE_INFO_MAX_PARAMETER_SIZE:
return ReturnValue(
size_t{Device->ZeDeviceModuleProperties->maxArgumentsSize});
Expand Down Expand Up @@ -908,6 +910,14 @@ ur_result_t ur_device_handle_t_::initialize(int SubSubDeviceOrdinal,
ZE_CALL_NOCHECK(zeDeviceGetComputeProperties, (ZeDevice, &Properties));
};

ZeDeviceIpVersionExt.Compute =
[ZeDevice](ze_device_ip_version_ext_t &Properties) {
ze_device_properties_t P;
P.stype = ZE_STRUCTURE_TYPE_DEVICE_PROPERTIES;
P.pNext = (void *)&Properties;
ZE_CALL_NOCHECK(zeDeviceGetProperties, (ZeDevice, &P));
};

ZeDeviceImageProperties.Compute =
[ZeDevice](ze_device_image_properties_t &Properties) {
ZE_CALL_NOCHECK(zeDeviceGetImageProperties, (ZeDevice, &Properties));
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -168,4 +168,5 @@ struct ur_device_handle_t_ : _ur_object {
ZeCache<ZeStruct<ze_device_memory_access_properties_t>>
ZeDeviceMemoryAccessProperties;
ZeCache<ZeStruct<ze_device_cache_properties_t>> ZeDeviceCacheProperties;
ZeCache<ZeStruct<ze_device_ip_version_ext_t>> ZeDeviceIpVersionExt;
};
9 changes: 9 additions & 0 deletions sycl/source/detail/device_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -480,6 +480,15 @@ std::string device_impl::getDeviceName() const {
return MDeviceName;
}

ext::oneapi::experimental::architecture device_impl::getDeviceArch() const {
std::call_once(MDeviceArchFlag, [this]() {
MDeviceArch =
get_info<ext::oneapi::experimental::info::device::architecture>();
});

return MDeviceArch;
}

// On first call this function queries for device timestamp
// along with host synchronized timestamp and stores it in memeber varaible
// MDeviceHostBaseTime. Subsequent calls to this function would just retrieve
Expand Down
7 changes: 7 additions & 0 deletions sycl/source/detail/device_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -234,6 +234,10 @@ class device_impl {

std::string getDeviceName() const;

bool extOneapiArchitectureIs(ext::oneapi::experimental::architecture Arch) {
return Arch == getDeviceArch();
}

/// Gets the current device timestamp
/// @throw sycl::feature_not_supported if feature is not supported on device
uint64_t getCurrentDeviceTime();
Expand All @@ -253,6 +257,7 @@ class device_impl {
explicit device_impl(pi_native_handle InteropDevice,
sycl::detail::pi::PiDevice Device,
PlatformImplPtr Platform, const PluginPtr &Plugin);
ext::oneapi::experimental::architecture getDeviceArch() const;
sycl::detail::pi::PiDevice MDevice = 0;
sycl::detail::pi::PiDeviceType MType;
sycl::detail::pi::PiDevice MRootDevice = nullptr;
Expand All @@ -261,6 +266,8 @@ class device_impl {
bool MIsAssertFailSupported = false;
mutable std::string MDeviceName;
mutable std::once_flag MDeviceNameFlag;
mutable ext::oneapi::experimental::architecture MDeviceArch;
mutable std::once_flag MDeviceArchFlag;
std::pair<uint64_t, uint64_t> MDeviceHostBaseTime;
}; // class device_impl

Expand Down
84 changes: 84 additions & 0 deletions sycl/source/detail/device_info.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
#include <sycl/detail/os_util.hpp>
#include <sycl/detail/pi.hpp>
#include <sycl/device.hpp>
#include <sycl/ext/oneapi/experimental/device_architecture.hpp>
#include <sycl/feature_test.hpp>
#include <sycl/info/info_desc.hpp>
#include <sycl/memory_enums.hpp>
Expand Down Expand Up @@ -567,6 +568,83 @@ struct get_device_info_impl<range<Dimensions>,
}
};

template <>
struct get_device_info_impl<
ext::oneapi::experimental::architecture,
ext::oneapi::experimental::info::device::architecture> {
static ext::oneapi::experimental::architecture get(const DeviceImplPtr &Dev) {
using oneapi_exp_arch = sycl::ext::oneapi::experimental::architecture;
auto ReturnHelper = [](auto MapDeviceIpToArch, auto DeviceIp) {
// TODO: use std::map::contains instead of try-catch when SYCL RT be moved
// to C++20
try {
oneapi_exp_arch Result = MapDeviceIpToArch.at(DeviceIp);
return Result;
} catch (std::out_of_range &) {
throw sycl::exception(
make_error_code(errc::runtime),
"The current device architecture is not supported by "
"sycl_ext_oneapi_device_architecture.");
}
};
backend CurrentBackend = Dev->getBackend();
if (Dev->is_gpu() && (backend::ext_oneapi_level_zero == CurrentBackend ||
backend::opencl == CurrentBackend)) {
std::map<uint32_t, oneapi_exp_arch> MapDeviceIpToArch = {
{0x02000000, oneapi_exp_arch::intel_gpu_bdw},
{0x02400009, oneapi_exp_arch::intel_gpu_skl},
{0x02404009, oneapi_exp_arch::intel_gpu_kbl},
{0x02408009, oneapi_exp_arch::intel_gpu_cfl},
{0x0240c000, oneapi_exp_arch::intel_gpu_apl},
{0x02410000, oneapi_exp_arch::intel_gpu_glk},
{0x02414000, oneapi_exp_arch::intel_gpu_whl},
{0x02418000, oneapi_exp_arch::intel_gpu_aml},
{0x0241c000, oneapi_exp_arch::intel_gpu_cml},
{0x02c00000, oneapi_exp_arch::intel_gpu_icllp},
{0x03000000, oneapi_exp_arch::intel_gpu_tgllp},
{0x03004000, oneapi_exp_arch::intel_gpu_rkl},
{0x03008000, oneapi_exp_arch::intel_gpu_adl_s},
{0x03008000, oneapi_exp_arch::intel_gpu_rpl_s},
{0x0300c000, oneapi_exp_arch::intel_gpu_adl_p},
{0x03010000, oneapi_exp_arch::intel_gpu_adl_n},
{0x03028000, oneapi_exp_arch::intel_gpu_dg1},
{0x030dc008, oneapi_exp_arch::intel_gpu_acm_g10},
{0x030e0005, oneapi_exp_arch::intel_gpu_acm_g11},
{0x030e4000, oneapi_exp_arch::intel_gpu_acm_g12},
{0x030f0007, oneapi_exp_arch::intel_gpu_pvc},
};
uint32_t DeviceIp;
Dev->getPlugin()->call<PiApiKind::piDeviceGetInfo>(
Dev->getHandleRef(),
PiInfoCode<
ext::oneapi::experimental::info::device::architecture>::value,
sizeof(DeviceIp), &DeviceIp, nullptr);
return ReturnHelper(MapDeviceIpToArch, DeviceIp);
} else if (Dev->is_cpu() && backend::opencl == CurrentBackend) {
// TODO: add support of different CPU architectures to
// sycl_ext_oneapi_device_architecture
return sycl::ext::oneapi::experimental::architecture::x86_64;
} // else is not needed
// TODO: add support of other arhitectures by extending with else if

// Generating a user-friendly error message
std::string DeviceStr;
if (Dev->is_gpu())
DeviceStr = "GPU";
else if (Dev->is_cpu())
DeviceStr = "CPU";
else if (Dev->is_accelerator())
DeviceStr = "accelerator";
// else if not needed
std::stringstream ErrorMessage;
ErrorMessage
<< "sycl_ext_oneapi_device_architecture feature is not supported on "
<< DeviceStr << " device with sycl::backend::" << CurrentBackend
<< " backend.";
throw sycl::exception(make_error_code(errc::runtime), ErrorMessage.str());
}
};

template <>
struct get_device_info_impl<
size_t, ext::oneapi::experimental::info::device::max_global_work_groups> {
Expand Down Expand Up @@ -826,6 +904,12 @@ inline std::vector<sycl::aspect> get_device_info_host<info::device::aspects>() {
return std::vector<sycl::aspect>();
}

template <>
inline ext::oneapi::experimental::architecture
get_device_info_host<ext::oneapi::experimental::info::device::architecture>() {
return ext::oneapi::experimental::architecture::x86_64;
}

template <>
inline info::device_type get_device_info_host<info::device::device_type>() {
return info::device_type::host;
Expand Down
5 changes: 5 additions & 0 deletions sycl/source/device.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -208,5 +208,10 @@ pi_native_handle device::getNative() const { return impl->getNative(); }

bool device::has(aspect Aspect) const { return impl->has(Aspect); }

bool device::ext_oneapi_architecture_is(
ext::oneapi::experimental::architecture arch) {
return impl->extOneapiArchitectureIs(arch);
}

} // __SYCL_INLINE_VER_NAMESPACE(_V1)
} // namespace sycl
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
// UNSUPPORTED: cuda, hip, esimd_emulator

// Enable this test, when GPU driver on Windows CI machines will be updated
// XFAIL: windows

// RUN: %{build} -o %t.out
// RUN: %{run} %t.out

#include <sycl/sycl.hpp>

int main() {
sycl::queue q;
sycl::device dev = q.get_device();

sycl::ext::oneapi::experimental::architecture arch = dev.get_info<
sycl::ext::oneapi::experimental::info::device::architecture>();

assert(dev.ext_oneapi_architecture_is(arch));

return 0;
}
Loading