[SYCL][Docs] Add kernel enqueue functions for kernel and properties #14707

Closed
@@ -121,10 +121,7 @@ Section 5.8.1 of the SYCL 2020 specification. Note that deprecated attributes
(such as `vec_type_hint`) are not included.

```c++
namespace sycl {
namespace ext {
namespace oneapi {
namespace experimental {
namespace sycl::ext::oneapi::experimental {

// Corresponds to reqd_work_group_size
struct work_group_size_key {
@@ -185,10 +182,7 @@ template <> struct is_property_key<work_group_size_hint_key> : std::true_type {}
template <> struct is_property_key<sub_group_size_key> : std::true_type {};
template <> struct is_property_key<device_has_key> : std::true_type {};

} // namespace experimental
} // namespace oneapi
} // namespace ext
} // namespace sycl
} // namespace sycl::ext::oneapi::experimental
```

|===
@@ -320,6 +314,23 @@ class handler {
range<dimensions> workGroupSize,
PropertyList properties,
const WorkgroupFunctionType &kernelFunc);

// Available only if all properties in `PropertyList` have only launch-related
// effects.
template <typename PropertyList>
void single_task(PropertyList properties, const kernel& kernelObject);

// Available only if all properties in `PropertyList` have only launch-related
// effects.
template <int Dimensions, typename PropertyList>
void parallel_for(range<Dimensions> numWorkItems, PropertyList properties,
const kernel& kernelObject);

// Available only if all properties in `PropertyList` have only launch-related
// effects.
template <int Dimensions, typename PropertyList>
void parallel_for(nd_range<Dimensions> ndRange, PropertyList properties,
const kernel& kernelObject);

Contributor

I also wonder if we should invest effort into extending these forms of single_task and parallel_for. The recently implemented sycl_ext_oneapi_enqueue_functions has a cleaner separation between launch properties vs. kernel properties.

I think our longer term strategy may be to drop the forms of single_task and parallel_for in sycl_ext_oneapi_kernel_properties and use the ones in sycl_ext_oneapi_enqueue_functions instead.

@Pennycook what were your thoughts on these two extensions regarding the ways of specifying kernel properties?

Contributor

> I think our longer term strategy may be to drop the forms of single_task and parallel_for in sycl_ext_oneapi_kernel_properties and use the ones in sycl_ext_oneapi_enqueue_functions instead.

I agree with this.

sycl_ext_oneapi_kernel_properties was essentially our first attempt at defining a kernel launch interface that accepted properties. The sycl_ext_oneapi_enqueue_functions design addresses a lot of feedback we received from users and other implementers, and it's much more aligned with where we expect future versions of SYCL to go.

My preference would be that we move to sycl_ext_oneapi_enqueue_functions as soon as possible, and deprecate these property overloads. If there's a short-term need to expose these overloads with minimal effort, I'm not opposed, but we should figure out an implementation plan for the new extension.

Contributor

> If there's a short-term need to expose these overloads with minimal effort, I'm not opposed, but we should figure out an implementation plan for the new extension.

I think sycl_ext_oneapi_enqueue_functions is already implemented.

@steffenlarsen I'm not sure who is asking for this change. Can we ask them to use sycl_ext_oneapi_enqueue_functions instead of making the changes in this PR?

}
}
```
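
The restriction documented on the new `kernel`-object overloads (they accept only property lists whose properties have launch-related effects) can be sketched in plain C++17 without any SYCL headers. This is a minimal illustration under stated assumptions, not the DPC++ implementation: `properties_t`, `work_group_size_prop`, `cache_config_prop`, and `accepts_for_kernel_object` are hypothetical stand-ins, and only the two trait names mirror the ones this PR adds.

```cpp
#include <type_traits>

// Hypothetical stand-ins for property value types (not the real SYCL classes).
struct work_group_size_prop {}; // assumed to influence how a kernel is compiled
struct cache_config_prop {};    // assumed to influence only the launch

// Hypothetical stand-in for a property list type.
template <typename... Props> struct properties_t {};

// Primary template: a property has no compile-time effect unless marked.
template <typename> struct HasCompileTimeEffect : std::false_type {};
template <>
struct HasCompileTimeEffect<work_group_size_prop> : std::true_type {};

// False for anything that is not a property list; for a property list, true
// only when none of its properties has a compile-time effect.
template <typename T>
struct NoPropertyHasCompileTimeKernelEffect : std::false_type {};
template <typename... Ts>
struct NoPropertyHasCompileTimeKernelEffect<properties_t<Ts...>> {
  static constexpr bool value =
      !(HasCompileTimeEffect<Ts>::value || ... || false);
};

// Hypothetical helper: reports whether a kernel-object overload would accept
// the given property list.
template <typename PropertyList>
constexpr bool accepts_for_kernel_object(PropertyList) {
  return NoPropertyHasCompileTimeKernelEffect<PropertyList>::value;
}
```

The presumed rationale is that a `kernel` object refers to code that has already been built, so a property that must be baked in at compile time cannot take effect at enqueue time; rejecting it with a `static_assert` surfaces the mistake early.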
@@ -92,7 +92,8 @@ using the mechanism defined in sycl_ext_oneapi_kernel_properties.
=== Kernel Interface Properties

```c++
namespace sycl::ext::intel::experimental {
namespace sycl::ext {
namespace intel::experimental {

enum class streaming_interface_options_enum {
accept_downstream_stall,
@@ -167,11 +168,16 @@ inline constexpr fpga_cluster_key::value_t<
inline constexpr fpga_cluster_key::value_t<
fpga_cluster_options_enum::stall_free_clusters> stall_free_clusters;

template <> struct is_property_key<streaming_interface_key> : std::true_type {};
template <> struct is_property_key<register_map_interface_key> : std::true_type {};
template <> struct is_property_key<fpga_cluster_key> : std::true_type {};
} // intel::experimental

} // namespace sycl::ext::intel::experimental
namespace oneapi::experimental {

template <> struct is_property_key<intel::experimental::streaming_interface_key> : std::true_type {};
template <> struct is_property_key<intel::experimental::register_map_interface_key> : std::true_type {};
template <> struct is_property_key<intel::experimental::fpga_cluster_key> : std::true_type {};

} // namespace oneapi::experimental
} // namespace sycl::ext
```
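
The hunk above moves the `is_property_key` specializations out of `intel::experimental` and into `oneapi::experimental`, presumably because an explicit specialization has to be declared in a namespace that encloses the primary template. A minimal standalone sketch of that layout follows; the `demo` root namespace is hypothetical and the key is an empty placeholder rather than the real FPGA property.

```cpp
#include <type_traits>

namespace demo::ext {

namespace oneapi::experimental {
// Primary template lives in oneapi::experimental.
template <typename> struct is_property_key : std::false_type {};
} // namespace oneapi::experimental

namespace intel::experimental {
// Placeholder for a vendor-specific property key.
struct fpga_cluster_key {};
} // namespace intel::experimental

namespace oneapi::experimental {
// The specialization is written where the primary template lives and names
// the vendor key with a qualified name.
template <>
struct is_property_key<intel::experimental::fpga_cluster_key>
    : std::true_type {};
} // namespace oneapi::experimental

} // namespace demo::ext
```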

|===
1 change: 1 addition & 0 deletions sycl/include/sycl/detail/kernel_properties.hpp
@@ -45,6 +45,7 @@ struct PropertyMetaInfo<sycl::detail::register_alloc_mode_key::value_t<Mode>> {
static constexpr const char *name = "sycl-register-alloc-mode";
static constexpr sycl::detail::register_alloc_mode_enum value = Mode;
};

} // namespace detail
} // namespace ext::oneapi::experimental
} // namespace _V1
@@ -104,6 +104,15 @@ struct is_property_key_of<
: std::true_type {};

namespace detail {
template <intel::experimental::streaming_interface_options_enum option>
struct HasCompileTimeEffect<
intel::experimental::streaming_interface_key::value_t<option>>
: std::true_type {};
template <intel::experimental::register_map_interface_options_enum option>
struct HasCompileTimeEffect<
intel::experimental::register_map_interface_key::value_t<option>>
: std::true_type {};

template <intel::experimental::streaming_interface_options_enum Stall_Free>
struct PropertyMetaInfo<
intel::experimental::streaming_interface_key::value_t<Stall_Free>> {
13 changes: 13 additions & 0 deletions sycl/include/sycl/ext/oneapi/kernel_properties/properties.hpp
@@ -240,6 +240,19 @@ template <> struct is_property_key<work_item_progress_key> : std::true_type {};

namespace detail {

template <size_t... Dims>
struct HasCompileTimeEffect<work_group_size_key::value_t<Dims...>>
: std::true_type {};
template <size_t... Dims>
struct HasCompileTimeEffect<work_group_size_hint_key::value_t<Dims...>>
: std::true_type {};
template <uint32_t Size>
struct HasCompileTimeEffect<sub_group_size_key::value_t<Size>>
: std::true_type {};
template <sycl::aspect... Aspects>
struct HasCompileTimeEffect<device_has_key::value_t<Aspects...>>
: std::true_type {};

template <size_t Dim0, size_t... Dims>
struct PropertyMetaInfo<work_group_size_key::value_t<Dim0, Dims...>> {
static constexpr const char *name = "sycl-work-group-size";
2 changes: 2 additions & 0 deletions sycl/include/sycl/ext/oneapi/properties/property.hpp
@@ -266,6 +266,8 @@ template <typename PropertyT> struct PropertyMetaInfo {
static constexpr std::nullptr_t value = nullptr;
};

template <typename> struct HasCompileTimeEffect : std::false_type {};

} // namespace detail

template <typename T>
187 changes: 144 additions & 43 deletions sycl/include/sycl/handler.hpp
@@ -265,6 +265,18 @@ struct GetMergedKernelProperties<
PropertiesT, get_method_properties>;
};

// Checks that none of the properties in the property list has compile-time
// effects on the kernel.
template <typename T>
struct NoPropertyHasCompileTimeKernelEffect : std::false_type {};
template <typename... Ts>
struct NoPropertyHasCompileTimeKernelEffect<
ext::oneapi::experimental::detail::properties_t<Ts...>> {
static constexpr bool value =
!(ext::oneapi::experimental::detail::HasCompileTimeEffect<Ts>::value ||
... || false);
};

#if __SYCL_ID_QUERIES_FIT_IN_INT__
template <typename T, typename ValT>
typename std::enable_if_t<std::is_same<ValT, size_t>::value ||
@@ -970,28 +982,11 @@ class __SYCL_EXPORT handler {
}
}

/// Process kernel properties.
/// Process runtime kernel properties.
///
/// Stores information about kernel properties into the handler.
template <
typename KernelName,
typename PropertiesT = ext::oneapi::experimental::empty_properties_t>
void processProperties(PropertiesT Props) {
using KI = detail::KernelInfo<KernelName>;
static_assert(
ext::oneapi::experimental::is_property_list<PropertiesT>::value,
"Template type is not a property list.");
static_assert(
!PropertiesT::template has_property<
sycl::ext::intel::experimental::fp_control_key>() ||
(PropertiesT::template has_property<
sycl::ext::intel::experimental::fp_control_key>() &&
KI::isESIMD()),
"Floating point control property is supported for ESIMD kernels only.");
static_assert(
!PropertiesT::template has_property<
sycl::ext::oneapi::experimental::indirectly_callable_key>(),
"indirectly_callable property cannot be applied to SYCL kernels");
template <typename PropertiesT>
void processLaunchProperties(PropertiesT Props) {
if constexpr (PropertiesT::template has_property<
sycl::ext::intel::experimental::cache_config_key>()) {
auto Config = Props.template get_property<
@@ -1042,6 +1037,32 @@ class __SYCL_EXPORT handler {
checkAndSetClusterRange(Props);
}

/// Process kernel properties.
///
/// Stores information about kernel properties into the handler.
template <
typename KernelName,
typename PropertiesT = ext::oneapi::experimental::empty_properties_t>
void processProperties(PropertiesT Props) {
using KI = detail::KernelInfo<KernelName>;
static_assert(
ext::oneapi::experimental::is_property_list<PropertiesT>::value,
"Template type is not a property list.");
static_assert(
!PropertiesT::template has_property<
sycl::ext::intel::experimental::fp_control_key>() ||
(PropertiesT::template has_property<
sycl::ext::intel::experimental::fp_control_key>() &&
KI::isESIMD()),
"Floating point control property is supported for ESIMD kernels only.");
static_assert(
!PropertiesT::template has_property<
sycl::ext::oneapi::experimental::indirectly_callable_key>(),
"indirectly_callable property cannot be applied to SYCL kernels");

processLaunchProperties(Props);
}

/// Checks whether it is possible to copy the source shape to the destination
/// shape(the shapes are described by the accessor ranges) by using
/// copying by regions of memory and not copying element by element
@@ -1440,12 +1461,15 @@ class __SYCL_EXPORT handler {
///
/// \param NumWorkItems is a range defining indexing space.
/// \param Kernel is a SYCL kernel function.
template <int Dims>
void parallel_for_impl(range<Dims> NumWorkItems, kernel Kernel) {
/// \param Props is the property list for the launch.
template <int Dims, typename PropertiesT>
void parallel_for_impl(range<Dims> NumWorkItems, PropertiesT Props,
kernel Kernel) {
throwIfActionIsCreated();
MKernel = detail::getSyclObjImpl(std::move(Kernel));
detail::checkValueRange<Dims>(NumWorkItems);
setNDRangeDescriptor(std::move(NumWorkItems));
processLaunchProperties<PropertiesT>(Props);
setType(detail::CGType::Kernel);
setNDRangeUsed(false);
extractArgsAndReqs();
@@ -2125,28 +2149,22 @@ class __SYCL_EXPORT handler {
///
/// \param Kernel is a SYCL kernel object.
void single_task(kernel Kernel) {
throwIfActionIsCreated();
// Ignore any set kernel bundles and use the one associated with the kernel
setHandlerKernelBundle(Kernel);
// No need to check if range is out of INT_MAX limits as it's compile-time
// known constant
setNDRangeDescriptor(range<1>{1});
MKernel = detail::getSyclObjImpl(std::move(Kernel));
setType(detail::CGType::Kernel);
extractArgsAndReqs();
MKernelName = getKernelName();
single_task(ext::oneapi::experimental::empty_properties_t{}, Kernel);
}

void parallel_for(range<1> NumWorkItems, kernel Kernel) {
parallel_for_impl(NumWorkItems, Kernel);
parallel_for_impl(NumWorkItems,
ext::oneapi::experimental::empty_properties_t{}, Kernel);
}

void parallel_for(range<2> NumWorkItems, kernel Kernel) {
parallel_for_impl(NumWorkItems, Kernel);
parallel_for_impl(NumWorkItems,
ext::oneapi::experimental::empty_properties_t{}, Kernel);
}

void parallel_for(range<3> NumWorkItems, kernel Kernel) {
parallel_for_impl(NumWorkItems, Kernel);
parallel_for_impl(NumWorkItems,
ext::oneapi::experimental::empty_properties_t{}, Kernel);
}

/// Defines and invokes a SYCL kernel function for the specified range and
@@ -2180,14 +2198,8 @@ class __SYCL_EXPORT handler {
/// well as offset.
/// \param Kernel is a SYCL kernel function.
template <int Dims> void parallel_for(nd_range<Dims> NDRange, kernel Kernel) {
throwIfActionIsCreated();
MKernel = detail::getSyclObjImpl(std::move(Kernel));
detail::checkValueRange<Dims>(NDRange);
setNDRangeDescriptor(std::move(NDRange));
setType(detail::CGType::Kernel);
setNDRangeUsed(true);
extractArgsAndReqs();
MKernelName = getKernelName();
parallel_for(NDRange, ext::oneapi::experimental::empty_properties_t{},
Kernel);
}

/// Defines and invokes a SYCL kernel function.
@@ -2573,6 +2585,95 @@ class __SYCL_EXPORT handler {
NumWorkGroups, WorkGroupSize, Props, KernelFunc);
}

/// Invokes a SYCL kernel.
///
/// Executes exactly once. The kernel invocation method has no functors and
/// cannot be called on host.
///
/// \param Kernel is a SYCL kernel object.
/// \param Props is the properties for the launch.
template <typename PropertiesT>
std::enable_if_t<ext::oneapi::experimental::is_property_list_v<PropertiesT>,
void>
single_task(PropertiesT Props, kernel Kernel) {
static_assert(
detail::NoPropertyHasCompileTimeKernelEffect<PropertiesT>::value,
"This kernel enqueue function does not allow properties with "
"compile-time kernel effects.");
throwIfActionIsCreated();
// Ignore any set kernel bundles and use the one associated with the kernel
setHandlerKernelBundle(Kernel);
// No need to check if range is out of INT_MAX limits as it's compile-time
// known constant
setNDRangeDescriptor(range<1>{1});
processLaunchProperties(Props);
MKernel = detail::getSyclObjImpl(std::move(Kernel));
setType(detail::CGType::Kernel);
extractArgsAndReqs();
MKernelName = getKernelName();
}

template <typename PropertiesT>
std::enable_if_t<ext::oneapi::experimental::is_property_list_v<PropertiesT>,
void>
parallel_for(range<1> NumWorkItems, PropertiesT Props, kernel Kernel) {
static_assert(
detail::NoPropertyHasCompileTimeKernelEffect<PropertiesT>::value,
"This kernel enqueue function does not allow properties with "
"compile-time kernel effects.");
parallel_for_impl(NumWorkItems, Props, Kernel);
}

template <typename PropertiesT>
std::enable_if_t<ext::oneapi::experimental::is_property_list_v<PropertiesT>,
void>
parallel_for(range<2> NumWorkItems, PropertiesT Props, kernel Kernel) {
static_assert(
detail::NoPropertyHasCompileTimeKernelEffect<PropertiesT>::value,
"This kernel enqueue function does not allow properties with "
"compile-time kernel effects.");
parallel_for_impl(NumWorkItems, Props, Kernel);
}

template <typename PropertiesT>
std::enable_if_t<ext::oneapi::experimental::is_property_list_v<PropertiesT>,
void>
parallel_for(range<3> NumWorkItems, PropertiesT Props, kernel Kernel) {
static_assert(
detail::NoPropertyHasCompileTimeKernelEffect<PropertiesT>::value,
"This kernel enqueue function does not allow properties with "
"compile-time kernel effects.");
parallel_for_impl(NumWorkItems, Props, Kernel);
}

/// Defines and invokes a SYCL kernel function for the specified range and
/// offsets.
///
/// The SYCL kernel function is defined as SYCL kernel object.
///
/// \param NDRange is a ND-range defining global and local sizes as
/// well as offset.
/// \param Props is the properties for the launch.
/// \param Kernel is a SYCL kernel function.
template <int Dims, typename PropertiesT>
std::enable_if_t<ext::oneapi::experimental::is_property_list_v<PropertiesT>,
void>
parallel_for(nd_range<Dims> NDRange, PropertiesT Props, kernel Kernel) {
static_assert(
detail::NoPropertyHasCompileTimeKernelEffect<PropertiesT>::value,
"This kernel enqueue function does not allow properties with "
"compile-time kernel effects.");
throwIfActionIsCreated();
MKernel = detail::getSyclObjImpl(std::move(Kernel));
detail::checkValueRange<Dims>(NDRange);
setNDRangeDescriptor(std::move(NDRange));
processLaunchProperties(Props);
setType(detail::CGType::Kernel);
setNDRangeUsed(true);
extractArgsAndReqs();
MKernelName = getKernelName();
}

// Clean up KERNELFUNC macro.
#undef _KERNELFUNCPARAM
