Update the sycl_ext_oneapi_dot_accumulate extension spec to:
* Use the latest specification template.
* Document the "packed" APIs. These were previously shown in the
"sample header" section, but there was no description.
* Deprecate the old "packed" overloads and introduce new overloads
with the more descriptive name `dot_acc_4x8packed`. This new name
is consistent with the OpenCL C naming style for similar functions.
// This section needs to be after the document title.
4
7
:doctype: book
8
+
:toc2:
9
+
:toc: left
5
10
:encoding: utf-8
6
11
:lang: en
12
+
:dpcpp: pass:[DPC++]
7
13
8
-
:blank: pass:[ +]
14
+
// Set the default source code type in this document to C++,
15
+
// for syntax highlighting purposes. This is needed because
16
+
// docbook uses c++ and html5 uses cpp.
17
+
:language: {basebackend@docbook:c++:cpp}
9
18
10
-
// Set the default source code type in this document to C,
11
-
// for syntax highlighting purposes.
12
-
:language: c
13
19
14
-
// This is what is needed for C++, since docbook uses c++
15
-
// and everything else uses cpp. This doesn't work when
16
-
// source blocks are in table cells, though, so don't use
17
-
// C++ unless it is required.
18
-
//:language: {basebackend@docbook:c++:cpp}
20
+
== Notice
21
+
22
+
[%hardbreaks]
23
+
Copyright (C) 2020-2023 Intel Corporation. All rights reserved.
19
24
20
-
== Introduction
25
+
Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
26
+
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
27
+
permission by Khronos.
21
28
22
-
IMPORTANT: This specification is a draft.
23
29
24
-
NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by permission by Khronos.
30
+
== Contact
25
31
26
-
NOTE: This document is better viewed when rendered as html with asciidoctor. GitHub does not render image icons.
32
+
To report problems with this extension, please open a new issue at:
27
33
28
-
== Name Strings
34
+
https://github.com/intel/llvm/issues
29
35
30
-
`SYCL_ONEAPI_dot_accumulate`
31
36
32
-
This is a placeholder name.
37
+
== Dependencies
33
38
34
-
== Notice
39
+
This extension is written against the SYCL 2020 revision 7 specification. All
40
+
references below to the "core SYCL specification" or to section numbers in the
41
+
SYCL specification refer to that revision.
35
42
36
-
Copyright (c) 2020 Intel Corporation. All rights reserved.
37
43
38
44
== Status
39
45
40
-
Working Draft
46
+
This extension is implemented and fully supported by {dpcpp}.
41
47
42
-
This is a preview extension specification, intended to provide early access to a feature for review and community feedback. When the feature matures, this specification may be released as a formal extension.
43
48
44
-
Because the interfaces defined by this specification are not final and are subject to change they are not intended to be used by shipping software products.
49
+
== Overview
45
50
46
-
== Version
51
+
This extension adds new SYCL built-in functions that may simplify development
52
+
and provide access to specialized hardware instructions when a SYCL kernel needs
53
+
to perform a dot product of two vectors followed by a scalar accumulation.
47
54
48
-
Built On: {docdate} +
49
-
Revision: 3
50
55
51
-
== Contact
56
+
== Specification
52
57
53
-
Ben Ashbaugh, Intel (ben 'dot' ashbaugh 'at' intel 'dot' com)
58
+
=== Feature test macro
54
59
55
-
== Dependencies
60
+
This extension provides a feature-test macro as described in the core SYCL
61
+
specification. An implementation supporting this extension must predefine the
62
+
macro `SYCL_EXT_ONEAPI_DOT_ACCUMULATE` to one of the values defined in the table
63
+
below. Applications can test for the existence of this macro to determine if
64
+
the implementation supports this feature, or applications can test the macro's
65
+
value to determine which of the extension's features the implementation
66
+
supports.
56
67
57
-
This extension is written against the SYCL 1.2.1 specification, Revision v1.2.1-6.
68
+
[%header,cols="1,5"]
69
+
|===
70
+
|Value
71
+
|Description
58
72
59
-
== Overview
73
+
|1
74
+
|Initial version of this extension.
75
+
|===
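As a rough illustration of the macro test described above, the following sketch reports the macro's value at compile time. The helper name `dot_accumulate_version` is hypothetical and not part of the extension, and the sketch assumes that any header the implementation needs in order to define the macro has already been included.

```cpp
#include <cassert>

// Hypothetical helper, not part of the extension.
// Returns the value of SYCL_EXT_ONEAPI_DOT_ACCUMULATE when the macro is
// defined, or 0 when the implementation does not advertise the extension.
constexpr int dot_accumulate_version() {
#ifdef SYCL_EXT_ONEAPI_DOT_ACCUMULATE
  return SYCL_EXT_ONEAPI_DOT_ACCUMULATE;
#else
  return 0;
#endif
}
```

An application could compare the returned value against 1, the only value listed in the table, before relying on the functions defined by this extension.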
60
76
61
-
This extension adds new SYCL built-in functions that may simplify development and provide access to specialized hardware instructions when a SYCL kernel needs to perform a dot product of two vectors followed by a scalar accumulation.
77
+
=== New functions to compute the dot product of vectors
62
78
63
-
== Enabling the extension
79
+
This extension adds the following free functions:
64
80
65
-
The extension is always enabled. The dot product functionality may be emulated in software or executed using hardware when suitable instructions are available.
81
+
[source,c++]
82
+
----
83
+
namespace sycl::ext::oneapi {
66
84
67
-
== Modifications of SYCL 1.2.1 specification
85
+
int32_t dot_acc(vec<int8_t,4> a, vec<int8_t,4> b, int32_t c);
86
+
int32_t dot_acc(vec<int8_t,4> a, vec<uint8_t,4> b, int32_t c);
87
+
int32_t dot_acc(vec<uint8_t,4> a, vec<int8_t,4> b, int32_t c);
88
+
int32_t dot_acc(vec<uint8_t,4> a, vec<uint8_t,4> b, int32_t c);
68
89
69
-
=== Add to Section 4.13.6 - Geometric Functions
90
+
int32_t dot_acc_4x8packed(int32_t a, int32_t b, int32_t c);
91
+
int32_t dot_acc_4x8packed(int32_t a, uint32_t b, int32_t c);
92
+
int32_t dot_acc_4x8packed(uint32_t a, int32_t b, int32_t c);
93
+
int32_t dot_acc_4x8packed(uint32_t a, uint32_t b, int32_t c);
70
94
71
-
In addition, the following functions are available in the namespace `sycl::intel` on the host and device.
95
+
} // namespace sycl::ext::oneapi
96
+
----
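The vector overloads listed above can be read as "widen each 8-bit element, multiply pairwise, sum, then add `c`". The following is a minimal host-side sketch of that reading, not the extension's implementation. It assumes `std::array` as a stand-in for `sycl::vec` so that it builds without a SYCL compiler, and the name `dot_acc_reference` is hypothetical. Overflow behavior is not addressed by the text, so the sketch assumes inputs whose result fits in `int32_t`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical reference routine, not part of the extension.
// A and B are expected to be std::int8_t or std::uint8_t, mirroring the four
// signed/unsigned combinations of the vector overloads.
template <typename A, typename B>
std::int32_t dot_acc_reference(const std::array<A, 4>& a,
                               const std::array<B, 4>& b,
                               std::int32_t c) {
  std::int32_t sum = c;  // start from the accumulator value
  for (std::size_t i = 0; i < 4; ++i) {
    // Widen each 8-bit element to 32 bits before multiplying.
    sum += static_cast<std::int32_t>(a[i]) * static_cast<std::int32_t>(b[i]);
  }
  return sum;
}
```

For instance, with signed `a = {1, -2, 3, -4}`, unsigned `b = {10, 20, 30, 40}`, and `c = 5`, the products are 10, -40, 90, and -160, so the routine returns -95.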
72
97
73
98
[cols="4a,4",options="header"]
74
99
|====
@@ -95,56 +120,59 @@ int32_t dot_acc(vec<uint8_t,4> a,
95
120
{blank}
96
121
The value that is returned is equivalent to +
97
122
{blank}
98
-
*dot*(_a_, _b_) + _c_
123
+
`dot(a, b) + c`
124
+
125
+
|[source,c]
126
+
----
127
+
int32_t dot_acc_4x8packed(int32_t a,
128
+
int32_t b,
129
+
int32_t c);
130
+
int32_t dot_acc_4x8packed(int32_t a,
131
+
uint32_t b,
132
+
int32_t c);
133
+
int32_t dot_acc_4x8packed(uint32_t a,
134
+
int32_t b,
135
+
int32_t c);
136
+
int32_t dot_acc_4x8packed(uint32_t a,
137
+
uint32_t b,
138
+
int32_t c);
139
+
----
140
+
141
+
|Performs a four-component integer dot product accumulate operation, where
142
+
`a` and `b` are 32-bit integers that each represent a vector of four 8-bit elements.
143
+
When the type of `a` or `b` is `int32_t`, it is interpreted as `vec<int8_t,4>`.
144
+
When the type of `a` or `b` is `uint32_t`, it is interpreted as
145
+
`vec<uint8_t,4>`. In each case, the least significant byte is element 0, and
146
+
the most significant byte is element 3.
99
147
100
148
|====
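The byte layout described for the packed overloads can be sketched on the host as follows. This is an illustrative reading of the table text, not the extension's implementation. The names `element` and `packed_dot_acc_reference` are hypothetical. The sketch assumes two's-complement narrowing when a byte is reinterpreted as `int8_t`, and it assumes inputs whose result fits in `int32_t`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of the extension.
// Extract element i, where element 0 is the least significant byte.
// A signed 32-bit argument yields signed 8-bit elements.
inline std::int32_t element(std::int32_t packed, int i) {
  const std::uint32_t bits = static_cast<std::uint32_t>(packed);
  return static_cast<std::int8_t>((bits >> (8 * i)) & 0xFFu);
}

// An unsigned 32-bit argument yields unsigned 8-bit elements.
inline std::int32_t element(std::uint32_t packed, int i) {
  return static_cast<std::uint8_t>((packed >> (8 * i)) & 0xFFu);
}

// A and B are expected to be std::int32_t or std::uint32_t.
template <typename A, typename B>
std::int32_t packed_dot_acc_reference(A a, B b, std::int32_t c) {
  std::int32_t sum = c;  // start from the accumulator value
  for (int i = 0; i < 4; ++i) {
    sum += element(a, i) * element(b, i);
  }
  return sum;
}
```

With `a = 0x04FD02FF` passed as `int32_t`, the elements are -1, 2, -3, and 4. With `b = 0x0A0A0A0A` passed as `uint32_t` (every element 10) and `c = 1`, the sketch returns 21. Passing the same bit pattern for `a` as `uint32_t` gives elements 255, 2, 253, and 4, and the result becomes 5141.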
101
149
102
-
== Sample Header
150
+
=== Deprecated functions
151
+
152
+
The following functions are deprecated. They have the same effect as the
153
+
`dot_acc_4x8packed` overloads described above.
103
154
104
155
[source,c++]
105
156
----
106
-
namespace cl {
107
-
namespace sycl {
108
-
namespace ext {
109
-
namespace oneapi {
110
-
111
-
int32_t dot_acc(vec<int8_t,4> a, vec<int8_t,4> b, int32_t c);
112
-
int32_t dot_acc(vec<int8_t,4> a, vec<uint8_t,4> b, int32_t c);
113
-
int32_t dot_acc(vec<uint8_t,4> a, vec<int8_t,4> b, int32_t c);
114
-
int32_t dot_acc(vec<uint8_t,4> a, vec<uint8_t,4> b, int32_t c);
157
+
namespace sycl::ext::oneapi {
115
158
116
159
int32_t dot_acc(int32_t a, int32_t b, int32_t c);
117
160
int32_t dot_acc(int32_t a, uint32_t b, int32_t c);
118
161
int32_t dot_acc(uint32_t a, int32_t b, int32_t c);
119
162
int32_t dot_acc(uint32_t a, uint32_t b, int32_t c);
120
163
121
-
} // oneapi
122
-
} // ext
123
-
} // sycl
124
-
} // cl
164
+
} // namespace sycl::ext::oneapi
125
165
----
126
166
167
+
127
168
== Issues
128
169
129
-
None.
130
-
131
-
== Revision History
132
-
133
-
[cols="5,15,15,70"]
134
-
[grid="rows"]
135
-
[options="header"]
136
-
|========================================
137
-
|Rev|Date|Author|Changes
138
-
|1|2019-12-13|Ben Ashbaugh|*Initial draft*
139
-
|2|2019-12-18|Ben Ashbaugh|Switched to standard C++ fixed width types.