- = sycl_ext_oneapi_record_event
+ = sycl_ext_oneapi_profiling_tag

:source-highlighter: coderay
:coderay-linenums-mode: table
@@ -78,7 +78,8 @@ sycl_ext_oneapi_enqueue_functions]
This extension provides a feature-test macro as described in the core SYCL
specification.
An implementation supporting this extension must predefine the macro
- `SYCL_EXT_ONEAPI_RECORD_EVENT` to one of the values defined in the table below.
+ `SYCL_EXT_ONEAPI_PROFILING_TAG` to one of the values defined in the table
+ below.
Applications can test for the existence of this macro to determine if the
implementation supports this feature, or applications can test the macro's
value to determine which of the extension's features the implementation
@@ -96,22 +97,22 @@ supports.

=== New device aspect

- This extension adds the `ext_oneapi_queue_event_recording` enumerator to the
+ This extension adds the `ext_oneapi_queue_profiling_tag` enumerator to the
`sycl::aspect` enumeration.

```
namespace sycl {

enum class aspect : /*unspecified*/ {
- ext_oneapi_queue_event_recording
+ ext_oneapi_queue_profiling_tag
};

} // namespace sycl
```

- When a device has this aspect, the `record_event` function may be called for a
- queue on this device even if the queue is not constructed with the property
- `property::queue::enable_profiling`.
+ When a device has this aspect, the `submit_profiling_tag` function may be
+ called for a queue on this device even if the queue is not constructed with the
+ property `property::queue::enable_profiling`.

=== New free function

@@ -126,22 +127,28 @@
----
namespace sycl::ext::oneapi::experimental {

- event record_event(const queue& q);
+ event submit_profiling_tag(const queue& q);

} // namespace sycl::ext::oneapi::experimental
----
!====

- _Effects:_ Enqueues a command barrier to `q`.
+ _Effects:_ If the queue `q` is out-of-order (i.e. was not constructed with
+ `property::queue::in_order`), this function enqueues a command barrier to `q`.
Any commands submitted after this barrier cannot begin execution until all
previously submitted commands have completed.
-
- _Returns:_ An event which represents the completion of the barrier.
- The event's status becomes `info::event_command_status::complete` when all
- commands submitted to the queue prior to the call to `record_event` have
- completed.
+ If this queue is in-order, this function simply enqueues a lightweight "tag"
+ command that marks the current head of the queue.
+
+ _Returns:_ If the queue is out-of-order, returns an event which represents the
+ completion of the barrier.
+ If the queue is in-order, returns an event which represents the completion of
+ the "tag" command.
+ In either case, the event's status becomes
+ `info::event_command_status::complete` when all commands submitted to the queue
+ prior to the call to `submit_profiling_tag` have completed.
The event's `info::event_profiling::command_submit` timestamp reflects the
- time at which `record_event` is called.
+ time at which `submit_profiling_tag` is called.
The event's `info::event_profiling::command_end` timestamp reflects the time
at which the event enters the "complete" state.

@@ -155,7 +162,7 @@ Implementations are encouraged to transition the event directly from the
_Throws:_ A synchronous `exception` with the `errc::invalid` error code if the
queue was not constructed with the `property::queue::enable_profiling` property
and if the queue's device does not have the aspect
- `ext_oneapi_queue_event_recording`.
+ `ext_oneapi_queue_profiling_tag`.

[_Note:_ In order to understand why the "command_start" and "command_end"
timestamps are encouraged to be the same, think of the barrier as an empty
@@ -185,17 +192,17 @@ static constexpr size_t N = 1024;
int main() {
sycl::queue q;

- if (!q.get_device().has(sycl::aspect::ext_oneapi_queue_event_recording)) {
+ if (!q.get_device().has(sycl::aspect::ext_oneapi_queue_profiling_tag)) {
std::cout << "Cannot time kernels without enabling profiling on queue\n";
return 1;
}

// commands submitted here are not timed

- sycl::event start = syclex::record_event(q);
+ sycl::event start = syclex::submit_profiling_tag(q);
sycl::parallel_for(q, {N}, [=](auto i) {/* first kernel */});
sycl::parallel_for(q, {N}, [=](auto i) {/* second kernel */});
- sycl::event end = syclex::record_event(q);
+ sycl::event end = syclex::submit_profiling_tag(q);

q.wait();

@@ -205,15 +212,3 @@ int main() {
std::cout << "Execution time: " << elapsed << " (nanoseconds)\n";
}
```
-
-
- == Issues
-
- . Is the name `record_event` confusing?
- +
- --
- *UNRESOLVED*: The current name is similar to the CUDA API `cudaEventRecord`,
- which has similar functionality.
- However, the word "record" may be confused with the recording functionality
- associated with SYCL graphs.
- --
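
The renamed feature-test macro `SYCL_EXT_ONEAPI_PROFILING_TAG` can be checked at compile time before calling `submit_profiling_tag`. The following is only a minimal sketch and is not part of the extension text. The helper names `profiling_tag_version` and `has_profiling_tag` are hypothetical. It also assumes the SYCL header is `<sycl/sycl.hpp>` and includes it only when present, so the snippet still compiles with a plain C++17 compiler.

```cpp
// Pull in the SYCL header only when one is available, so this sketch also
// compiles without a SYCL implementation (assumption: header path below).
#if defined(__has_include)
#  if __has_include(<sycl/sycl.hpp>)
#    include <sycl/sycl.hpp>
#  endif
#endif

// Hypothetical helper (not part of the extension): returns the value of the
// feature-test macro, or 0 when the implementation does not define it.
constexpr long profiling_tag_version() {
#ifdef SYCL_EXT_ONEAPI_PROFILING_TAG
  return SYCL_EXT_ONEAPI_PROFILING_TAG;
#else
  return 0;
#endif
}

// Hypothetical helper: true only when the macro is defined with a nonzero
// value, i.e. the implementation advertises the extension.
constexpr bool has_profiling_tag() { return profiling_tag_version() > 0; }
```

A caller could wrap code that uses `submit_profiling_tag` in `#ifdef SYCL_EXT_ONEAPI_PROFILING_TAG`. The macro only says the implementation supports the extension. Whether a given queue can be tagged without `property::queue::enable_profiling` still depends on the device aspect check in the example.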