Skip to content

Commit 1ce5b05

Browse files
author
iclsrc
committed
Merge from 'sycl' to 'sycl-web'
2 parents 0872db4 + ee211b6 commit 1ce5b05

File tree

1,470 files changed

+928264
-392
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

1,470 files changed

+928264
-392
lines changed

clang/lib/Sema/SemaDeclAttr.cpp

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -8032,10 +8032,12 @@ void Sema::CheckSYCLAddIRAttributesFunctionAttrConflicts(Decl *D) {
80328032
// "sycl-single-task" is present on all single_task invocations, implicitly
80338033
// added by the SYCL headers. It can only conflict with max_global_work_dim,
80348034
// but the value will be the same so there is no need for a warning.
8035-
if (NumArgsWithoutFilter == 2 &&
8036-
AddIRFuncAttr->getAttributeNameValuePairs(Context)[0].first ==
8037-
"sycl-single-task")
8038-
return;
8035+
if (NumArgsWithoutFilter == 2) {
8036+
auto NameValuePairs = AddIRFuncAttr->getAttributeNameValuePairs(Context);
8037+
if (NameValuePairs.size() > 0 &&
8038+
NameValuePairs[0].first == "sycl-single-task")
8039+
return;
8040+
}
80398041

80408042
// If there are potentially conflicting attributes, we issue a warning.
80418043
for (const auto *Attr : std::vector<AttributeCommonInfo *>{
Lines changed: 183 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,183 @@
1+
= sycl_ext_intel_cache_config
2+
:source-highlighter: coderay
3+
:coderay-linenums-mode: table
4+
5+
// This section needs to be after the document title.
6+
:doctype: book
7+
:toc2:
8+
:toc: left
9+
:encoding: utf-8
10+
:lang: en
11+
:dpcpp: pass:[DPC++]
12+
13+
:blank: pass:[ +]
14+
15+
// Set the default source code type in this document to C++,
16+
// for syntax highlighting purposes. This is needed because
17+
// docbook uses c++ and html5 uses cpp.
18+
:language: {basebackend@docbook:c++:cpp}
19+
20+
== Notice
21+
22+
[%hardbreaks]
23+
Copyright (c) 2023-2023 Intel Corporation. All rights reserved.
24+
25+
Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
26+
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
27+
permission by Khronos.
28+
29+
== Contact
30+
31+
To report problems with this extension, please open a new issue at:
32+
https://github.com/intel/llvm/issues
33+
34+
== Contributors
35+
36+
Greg Lueck, Intel +
37+
John Pennycook, Intel +
38+
Artur Gainullin, Intel
39+
40+
== Dependencies
41+
42+
This extension is written against the SYCL 2020 specification, Revision 6 and
43+
the following extensions:
44+
45+
- link:../experimental/sycl_ext_oneapi_properties.asciidoc[sycl_ext_oneapi_properties]
46+
- link:sycl_ext_oneapi_kernel_properties.asciidoc[sycl_ext_oneapi_kernel_properties]
47+
48+
== Status
49+
50+
This is an experimental extension specification, intended to provide early
51+
access to features and gather community feedback. Interfaces defined in this
52+
specification are implemented in {dpcpp}, but they are not finalized and may
53+
change incompatibly in future versions of {dpcpp} without prior notice.
54+
*Shipping software products should not rely on APIs defined in this
55+
specification.*
56+
57+
== Overview
58+
59+
There are devices where the same hardware resources are used for shared local
60+
memory (SLM) and L1 data cache. Developers may want more L1 data cache or SLM based
61+
on their application. This extension adds runtime kernel property `cache_config`
62+
which provides a way to set the preferred cache configuration for a kernel.
63+
64+
=== Feature Test Macro
65+
66+
This extension provides a feature-test macro as described in the core SYCL
67+
specification section 6.3.3 "Feature test macros". Therefore, an
68+
implementation supporting this extension must predefine the macro
69+
`SYCL_EXT_INTEL_CACHE_CONFIG` to one of the values defined
70+
in the table below. Applications can test for the existence of this macro to
71+
determine if the implementation supports this feature, or applications can test
72+
the macro's value to determine which of the extension's APIs the implementation
73+
supports.
74+
75+
[%header,cols="1,5"]
76+
|===
77+
|Value |Description
78+
|1 |Initial extension version. Base features are supported.
79+
|===
80+
81+
=== Introduction
82+
83+
This extension introduces new kernel property that can be applied to kernels
84+
using the mechanism defined in link:sycl_ext_oneapi_kernel_properties.asciidoc[sycl_ext_oneapi_kernel_properties].
85+
86+
=== Cache Config Property
87+
88+
```c++
89+
namespace sycl::ext::intel::experimental {
90+
91+
92+
enum class cache_config_enum : /*unspecified*/ { large_slm, large_data };
93+
94+
inline constexpr cache_config_enum large_slm =
95+
cache_config_enum::large_slm;
96+
inline constexpr cache_config_enum large_data =
97+
cache_config_enum::large_data;
98+
99+
struct cache_config {
100+
cache_config(cache_config_enum v) : value(v) {}
101+
cache_config_enum value;
102+
};
103+
104+
} // namespace sycl::ext::intel::experimental
105+
```
106+
107+
The `cache_config` property provides a way to set the preferred cache
108+
configuration for a kernel. The following values are supported:
109+
110+
* `large_slm`: Prefer having larger shared local memory and smaller L1 data cache.
111+
In this case driver will ensure that all workgroups will have enough
112+
SLM to run.
113+
114+
* `large_data`: Prefer having larger L1 data cache and smaller shared local memory.
115+
In this case SLM size may be shrinked (which may result in workgroups
116+
spawning as there will be not enough SLM to handle multiple workgroups)
117+
but L1 data cache will be bigger. There may be rare use cases when this
118+
is beneficial.
119+
120+
These property may be passed to any kernel invocation function (e.g.
121+
`parallel_for`) via the properties parameter. At most, only one of these
122+
values may be passed to any single kernel invocation function.
123+
124+
Backends that do not support this extension may accept and ignore this
125+
property.
126+
127+
=== Adding a Property List to a Kernel Launch
128+
129+
A simple example of using this extension is shown below.
130+
131+
The example assumes that the kernel will benefit from large SLM and hence uses the property
132+
`cache_config_large_slm`:
133+
134+
```c++
135+
using namespace sycl::ext::intel::experimental;
136+
{
137+
...
138+
properties kernel_properties{cache_config{large_slm}};
139+
140+
q.single_task(kernel_properties, [=] {
141+
*a = *b + *c;
142+
}).wait();
143+
}
144+
```
145+
146+
=== Embedding Property into a Kernel
147+
148+
The example below shows how the kernel from the previous section could be
149+
rewritten to leverage an embedded property list (see link:sycl_ext_oneapi_kernel_properties.asciidoc#embedding-properties-into-a-kernel[embedding-properties-into-a-kernel]):
150+
151+
```c++
152+
using namespace sycl::ext::intel::experimental;
153+
struct KernelFunctor {
154+
155+
KernelFunctor(int* a, int* b, int* c) : a(a), b(b), c(c) {}
156+
157+
void operator()() const {
158+
*a = *b + *c;
159+
}
160+
161+
auto get(properties_tag) const {
162+
return properties{cache_config{large_slm}};
163+
}
164+
165+
int* a;
166+
int* b;
167+
int* c;
168+
};
169+
170+
...
171+
172+
q.single_task(KernelFunctor{a, b, c}).wait();
173+
```
174+
175+
== Revision History
176+
177+
[cols="5,15,15,70"]
178+
[grid="rows"]
179+
[options="header"]
180+
|========================================
181+
|Rev|Date|Author|Changes
182+
|1|2022-03-01|Artur Gainullin|*Initial public working draft*
183+
|========================================

sycl/include/sycl/detail/cg.hpp

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -144,6 +144,7 @@ class CGExecKernel : public CG {
144144
detail::OSModuleHandle MOSModuleHandle;
145145
std::vector<std::shared_ptr<detail::stream_impl>> MStreams;
146146
std::vector<std::shared_ptr<const void>> MAuxiliaryResources;
147+
RT::PiKernelCacheConfig MKernelCacheConfig;
147148

148149
CGExecKernel(NDRDescT NDRDesc, std::unique_ptr<HostKernelBase> HKernel,
149150
std::shared_ptr<detail::kernel_impl> SyclKernel,
@@ -157,7 +158,8 @@ class CGExecKernel : public CG {
157158
detail::OSModuleHandle OSModuleHandle,
158159
std::vector<std::shared_ptr<detail::stream_impl>> Streams,
159160
std::vector<std::shared_ptr<const void>> AuxiliaryResources,
160-
CGTYPE Type, detail::code_location loc = {})
161+
CGTYPE Type, RT::PiKernelCacheConfig KernelCacheConfig,
162+
detail::code_location loc = {})
161163
: CG(Type, std::move(ArgsStorage), std::move(AccStorage),
162164
std::move(SharedPtrStorage), std::move(Requirements),
163165
std::move(Events), std::move(loc)),
@@ -166,7 +168,8 @@ class CGExecKernel : public CG {
166168
MKernelBundle(std::move(KernelBundle)), MArgs(std::move(Args)),
167169
MKernelName(std::move(KernelName)), MOSModuleHandle(OSModuleHandle),
168170
MStreams(std::move(Streams)),
169-
MAuxiliaryResources(std::move(AuxiliaryResources)) {
171+
MAuxiliaryResources(std::move(AuxiliaryResources)),
172+
MKernelCacheConfig(std::move(KernelCacheConfig)) {
170173
assert((getType() == RunOnHostIntel || getType() == Kernel) &&
171174
"Wrong type of exec kernel CG.");
172175
}

sycl/include/sycl/detail/pi.h

Lines changed: 17 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -77,9 +77,12 @@
7777
// 12.22 Add piGetDeviceAndHostTimer to query device wall-clock timestamp
7878
// 12.23 Added new piextEnqueueDeviceGlobalVariableWrite and
7979
// piextEnqueueDeviceGlobalVariableRead functions.
80+
// 12.24 Added new PI_EXT_KERNEL_EXEC_INFO_CACHE_CONFIG property to the
81+
// _pi_kernel_exec_info. Defined _pi_kernel_cache_config enum with values of
82+
// the new PI_EXT_KERNEL_EXEC_INFO_CACHE_CONFIG property.
8083

8184
#define _PI_H_VERSION_MAJOR 12
82-
#define _PI_H_VERSION_MINOR 23
85+
#define _PI_H_VERSION_MINOR 24
8386

8487
#define _PI_STRING_HELPER(a) #a
8588
#define _PI_CONCAT(a, b) _PI_STRING_HELPER(a.b)
@@ -621,6 +624,15 @@ constexpr pi_queue_properties PI_EXT_ONEAPI_QUEUE_FLAG_PRIORITY_LOW = (1 << 5);
621624
constexpr pi_queue_properties PI_EXT_ONEAPI_QUEUE_FLAG_PRIORITY_HIGH = (1 << 6);
622625
// clang-format on
623626

627+
typedef enum {
628+
// No preference for SLM or data cache.
629+
PI_EXT_KERNEL_EXEC_INFO_CACHE_DEFAULT = 0x0,
630+
// Large SLM size.
631+
PI_EXT_KERNEL_EXEC_INFO_CACHE_LARGE_SLM = 0x1,
632+
// Large General Data size.
633+
PI_EXT_KERNEL_EXEC_INFO_CACHE_LARGE_DATA = 0x2
634+
} _pi_kernel_cache_config;
635+
624636
using pi_result = _pi_result;
625637
using pi_platform_info = _pi_platform_info;
626638
using pi_device_type = _pi_device_type;
@@ -650,6 +662,7 @@ using pi_program_build_status = _pi_program_build_status;
650662
using pi_program_binary_type = _pi_program_binary_type;
651663
using pi_kernel_info = _pi_kernel_info;
652664
using pi_profiling_info = _pi_profiling_info;
665+
using pi_kernel_cache_config = _pi_kernel_cache_config;
653666

654667
// For compatibility with OpenCL define this not as enum.
655668
using pi_device_partition_property = intptr_t;
@@ -1357,7 +1370,9 @@ typedef enum {
13571370
/// indicates that the kernel might access data through USM ptrs
13581371
PI_USM_INDIRECT_ACCESS,
13591372
/// provides an explicit list of pointers that the kernel will access
1360-
PI_USM_PTRS = 0x4203
1373+
PI_USM_PTRS = 0x4203,
1374+
/// provides the preferred cache configuration (large slm or large data)
1375+
PI_EXT_KERNEL_EXEC_INFO_CACHE_CONFIG = 0x4204
13611376
} _pi_kernel_exec_info;
13621377

13631378
using pi_kernel_exec_info = _pi_kernel_exec_info;

sycl/include/sycl/detail/pi.hpp

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -145,6 +145,7 @@ using PiMemImageInfo = ::pi_image_info;
145145
using PiMemObjectType = ::pi_mem_type;
146146
using PiMemImageChannelOrder = ::pi_image_channel_order;
147147
using PiMemImageChannelType = ::pi_image_channel_type;
148+
using PiKernelCacheConfig = ::pi_kernel_cache_config;
148149

149150
__SYCL_EXPORT void contextSetExtendedDeleter(const sycl::context &constext,
150151
pi_context_extended_deleter func,
Lines changed: 67 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,67 @@
1+
//==----- gpu_kernel_properties.hpp - Kernel properties for Intel GPUs ---==//
2+
//
3+
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4+
// See https://llvm.org/LICENSE.txt for license information.
5+
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6+
//
7+
//===--------------------------------------------------------------------===//
8+
9+
#pragma once
10+
11+
#include <sycl/ext/oneapi/properties/property.hpp>
12+
13+
namespace sycl {
14+
__SYCL_INLINE_VER_NAMESPACE(_V1) {
15+
namespace ext::intel::experimental {
16+
17+
template <typename T, typename PropertyListT> class gpu_kernel_attribute;
18+
19+
enum class cache_config_enum : std::uint16_t { large_slm, large_data };
20+
21+
inline constexpr cache_config_enum large_slm =
22+
cache_config_enum::large_slm;
23+
inline constexpr cache_config_enum large_data =
24+
cache_config_enum::large_data;
25+
26+
struct cache_config {
27+
cache_config(cache_config_enum v) : value(v) {}
28+
cache_config_enum value;
29+
};
30+
31+
using cache_config_key = cache_config;
32+
33+
inline bool operator==(const cache_config &lhs,
34+
const cache_config &rhs) {
35+
return lhs.value == rhs.value;
36+
}
37+
inline bool operator!=(const cache_config &lhs,
38+
const cache_config &rhs) {
39+
return !(lhs == rhs);
40+
}
41+
42+
} // namespace ext::intel::experimental
43+
44+
namespace ext::oneapi::experimental {
45+
template <>
46+
struct is_property_key<intel::experimental::cache_config_key>
47+
: std::true_type {};
48+
49+
template <typename T, typename PropertyListT>
50+
struct is_property_key_of<
51+
intel::experimental::cache_config_key,
52+
intel::experimental::gpu_kernel_attribute<T, PropertyListT>>
53+
: std::true_type {};
54+
55+
namespace detail {
56+
template <> struct PropertyToKind<intel::experimental::cache_config_key> {
57+
static constexpr PropKind Kind = PropKind::CacheConfig;
58+
};
59+
60+
template <>
61+
struct IsRuntimeProperty<intel::experimental::cache_config_key>
62+
: std::true_type {};
63+
64+
} // namespace detail
65+
} // namespace ext::oneapi::experimental
66+
} // __SYCL_INLINE_VER_NAMESPACE(_V1)
67+
} // namespace sycl

sycl/include/sycl/ext/oneapi/properties/property.hpp

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -187,8 +187,9 @@ enum PropKind : uint32_t {
187187
MaxBurst = 21,
188188
WaitRequest = 22,
189189
Alignment = 23,
190+
CacheConfig = 24,
190191
// PropKindSize must always be the last value.
191-
PropKindSize = 24,
192+
PropKindSize = 25,
192193
};
193194

194195
// This trait must be specialized for all properties and must have a unique

0 commit comments

Comments
 (0)