-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL] [ABI-Break] Partial implementation of sycl_ext_oneapi_cuda_cluster_group #14113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
5242112
02001a1
939d01e
1b23bc4
1a15a1c
b26f303
eceb616
b718967
c37f4be
80ed867
9b75ebb
79f9d1f
40c0db4
411aa70
519034d
140b4d7
f95801a
3ff69cf
9651d4f
81c5456
f2adb81
4e82463
3b5ee0a
86a73cd
537d7c5
007ef24
364d1d8
24d13cf
e2ecd3e
1a16024
846f5f1
5cf823f
875038a
ad0adcd
e5015b5
edb3c9d
6f39040
e3fcd1d
f694315
94f6f77
5963b36
aa21ff5
f3a7dfa
db6ed43
71e3336
9863621
3395142
1c35fdd
96c84ca
f0f9bfd
61e3474
2f4ac06
4e1e14f
9eb69ad
0380732
0cf1681
bd85b80
a1b80d5
d26e53f
e0aa8c8
6374f6e
055bbc9
3f909c9
1032723
56acac7
722d29a
af08b2a
f27769e
9d7938a
353f759
aa5a64d
8ffacc3
1f0ba28
4e52601
39f8f5d
86db950
7c86278
43553ea
2cfe979
9ec11de
c6a8ef2
24eb8f2
91bc7af
f11bc5a
8498375
c24fc47
a17f229
8801a6a
5a0b039
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,94 @@ | ||
//==--- cluster_group_prop.hpp --- SYCL extension for cuda cluster group ---==// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#pragma once | ||
|
||
#include <sycl/ext/oneapi/properties/properties.hpp> | ||
#include <sycl/range.hpp> | ||
|
||
namespace sycl { | ||
inline namespace _V1 { | ||
namespace ext::oneapi::experimental { | ||
|
||
namespace cuda { | ||
template <int Dim> | ||
struct cluster_size | ||
: ::sycl::ext::oneapi::experimental::detail::run_time_property_key< | ||
::sycl::ext::oneapi::experimental::detail::ClusterLaunch> { | ||
cluster_size(const range<Dim> &size) : size(size) {} | ||
sycl::range<Dim> get_cluster_size() { return size; } | ||
|
||
private: | ||
range<Dim> size; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This member should maybe be private. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe, but I'm not sure. I doubt that this is very important either way but I didn't design the Soon I will be on holiday and will be unable to continue refactoring it as new abi-break PRs and unified runtime changes are merged, such that it will be at risk of missing the abi-break window at all. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @intel/dpcpp-tools-reviewers This is 100% green now. Please could you review this asap. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @AerialMantis Made the member |
||
}; | ||
|
||
template <int Dim> using cluster_size_key = cluster_size<Dim>; | ||
|
||
} // namespace cuda | ||
|
||
template <> | ||
struct is_property_key<cuda::cluster_size_key<1>> : std::true_type {}; | ||
template <> | ||
struct is_property_key<cuda::cluster_size_key<2>> : std::true_type {}; | ||
template <> | ||
struct is_property_key<cuda::cluster_size_key<3>> : std::true_type {}; | ||
|
||
template <typename T> | ||
struct is_property_key_of<cuda::cluster_size_key<1>, T> : std::true_type {}; | ||
|
||
template <typename T> | ||
struct is_property_key_of<cuda::cluster_size_key<2>, T> : std::true_type {}; | ||
|
||
template <typename T> | ||
struct is_property_key_of<cuda::cluster_size_key<3>, T> : std::true_type {}; | ||
|
||
template <> | ||
struct is_property_value<cuda::cluster_size_key<1>> | ||
: is_property_key<cuda::cluster_size_key<1>> {}; | ||
template <> | ||
struct is_property_value<cuda::cluster_size_key<2>> | ||
: is_property_key<cuda::cluster_size_key<2>> {}; | ||
template <> | ||
struct is_property_value<cuda::cluster_size_key<3>> | ||
: is_property_key<cuda::cluster_size_key<3>> {}; | ||
|
||
template <typename O> | ||
struct is_property_value_of<cuda::cluster_size_key<1>, O> | ||
: is_property_key_of<cuda::cluster_size_key<1>, O> {}; | ||
|
||
template <typename O> | ||
struct is_property_value_of<cuda::cluster_size_key<2>, O> | ||
: is_property_key_of<cuda::cluster_size_key<2>, O> {}; | ||
|
||
template <typename O> | ||
struct is_property_value_of<cuda::cluster_size_key<3>, O> | ||
: is_property_key_of<cuda::cluster_size_key<3>, O> {}; | ||
|
||
namespace detail { | ||
template <typename PropertiesT> constexpr std::size_t getClusterDim() { | ||
if constexpr (PropertiesT::template has_property< | ||
sycl::ext::oneapi::experimental::cuda::cluster_size_key< | ||
1>>()) { | ||
return 1; | ||
} | ||
if constexpr (PropertiesT::template has_property< | ||
sycl::ext::oneapi::experimental::cuda::cluster_size_key< | ||
2>>()) { | ||
return 2; | ||
} | ||
if constexpr (PropertiesT::template has_property< | ||
sycl::ext::oneapi::experimental::cuda::cluster_size_key< | ||
3>>()) { | ||
return 3; | ||
} | ||
return 0; | ||
} | ||
} // namespace detail | ||
} // namespace ext::oneapi::experimental | ||
} // namespace _V1 | ||
} // namespace sycl |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -149,9 +149,9 @@ void parallel_for(handler &CGH, | |
ext::oneapi::experimental::detail::LaunchConfigAccess<range<Dimensions>, | ||
Properties> | ||
ConfigAccess(Config); | ||
CGH.parallel_for<KernelName>(ConfigAccess.getRange(), | ||
std::forward<ReductionsT>(Reductions)..., | ||
KernelObj); | ||
CGH.parallel_for<KernelName>( | ||
ConfigAccess.getRange(), ConfigAccess.getProperties(), | ||
std::forward<ReductionsT>(Reductions)..., KernelObj); | ||
Comment on lines
+152
to
+154
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @steffenlarsen , I was wondering if you could share your insight here. For some context, This PR is a partial implementation of the sycl_ext_oneapi_cuda_cluster_group. This introduces a new runtime launch property, called as My question here is, I see Thanks There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @AD2605! The issue you're seeing stems from the fact that the extension that adds properties to Eventually we will need it for passing properties with runtime information, like in the work_group_specific extension, which will soon be renamed to work_group_static. I suggest you omit the changes to line 258 for now and if you need a runtime-value property passed down to it, the functionality can be added in a follow up, with related extension changes. That assumes this extension doesn't need it immediately. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see, thanks a Lot, Well since we do not need the kernel bundle support for now, I can simply remove it in the mean time. Thanks |
||
} | ||
|
||
template <typename KernelName = sycl::detail::auto_name, int Dimensions, | ||
|
@@ -225,9 +225,9 @@ void nd_launch(handler &CGH, | |
ext::oneapi::experimental::detail::LaunchConfigAccess<nd_range<Dimensions>, | ||
Properties> | ||
ConfigAccess(Config); | ||
CGH.parallel_for<KernelName>(ConfigAccess.getRange(), | ||
std::forward<ReductionsT>(Reductions)..., | ||
KernelObj); | ||
CGH.parallel_for<KernelName>( | ||
ConfigAccess.getRange(), ConfigAccess.getProperties(), | ||
std::forward<ReductionsT>(Reductions)..., KernelObj); | ||
} | ||
|
||
template <typename KernelName = sycl::detail::auto_name, int Dimensions, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an ABI break, as suggested by layout_handler.cpp and symbol_size_alignment.cpp. If we want to roll with that, this PR should be marked [ABI-break] and the corresponding label should be added.
Alternatively, you could change
MKernelUsesClusterLaunch
inhandler_impl
to be astd::optional<sycl::range<3>>
and pass that along. We have had some issues with passingstd::optional
across the library boundary before, but as long as it stays inside the source files, it should not be a problem.I am of two minds, because on one hand this seems like a fitting place for the new information and ABI-breaks are allowed. On the other hand, changing the layout of handler is exactly what
handler_impl
is here to prevent. Maybe a better solution is to make another ABI-break by movingNDRDescT
out ofhandler
and intohandler_impl
. @aelovikov-intel - Thoughts?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that your logic is sensible, moving to handler_impl might be the best option. @aelovikov-intel I'd also appreciate your input. Thanks
We focused on getting a reasonable implementation of this feature up to collect feedback, particularly on the scheduling/handler details from Intel developers, and this was one of the main points that we foresaw could be challenging/contentious.
I think it would be a good idea to focus on this point and get it right first time since it is an abi-break, and really we only have until the end of next week to solve this and get it merged, since both I and @AD2605 are on holiday after that and won't be back until the ABI-break window is over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made a patch for moving more
detail
parts into sources and moving some of the handler members into thehandler_impl
: #14460If that is the way we want to go, I would be okay with merging this as-is and moving the new changes as part of the aforementioned patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree.