Skip to content

Commit f90e9d8

Browse files
[SYCL] Add code location data to XPTI trace in case of exception thrown (#8101)
Adds extra information about code location of command submission to exception message. Now instrumented a few sycl::queue non-variadic methods which takes code location from parameter. Also instrumented variadic parallel_for where code location is extracted from kernel info. --------- Signed-off-by: Tikhomirova, Kseniya <[email protected]>
1 parent 2b8faae commit f90e9d8

26 files changed

+983
-72
lines changed

sycl/doc/design/SYCLInstrumentationUsingXPTI.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -241,7 +241,7 @@ by the SYCL runtime.
241241

242242
## SYCL Stream `"sycl"` Notification Signatures
243243

244-
All trace point types in bold provide semantic information about the graph, nodes and edges and the toplogy of the asynchronous task graphs created by the runtime.
244+
All trace point types in bold provide semantic information about the graph, nodes and edges and the topology of the asynchronous task graphs created by the runtime.
245245
| Trace Point Type | Parameter Description | Metadata |
246246
| :--------------: | :-------------------- | :------- |
247247
| **`graph_create`** | <div style="text-align: left"><li>**trace_type**: `xpti::trace_point_type_t::graph_create` that marks the creation of an asynchronous graph.</li> <li> **parent**: `nullptr`</li> <li> **event**: The global asynchronous graph object ID. All other graph related events such as node and edge creation will always this ID as the parent ID. </li> <li> **instance**: Unique ID related to the event, but not a correlation ID as there are other events to correlate to. </li> <li> **user_data**: `nullptr`</li> <p></p> SYCL runtime will always have one instance of a graph object with many disjoint subgraphs that get created during the execution of an application. </div> | None |
@@ -256,6 +256,7 @@ All trace point types in bold provide semantic information about the graph, node
256256
| `wait_end` | <div style="text-align: left"><li>**trace_type**: `xpti::trace_point_type_t::wait_end` that marks the beginning of the wait on an `event`</li> <li> **parent**: `nullptr`</li> <li> **event**: The event ID will reflect the ID of the command group object submission that created this event or a new event based on the combination of the string "queue.wait" and the address of the event. </li> <li> **instance**: Unique ID to allow the correlation of the `wait_begin` event with the `wait_end` event. </li> <li> **user_data**: String indicating `queue.wait` and the address of the event as `const char *` </li></div> | `sycl_device`, `sycl_device_type`, `sycl_device_name`, `sym_function_name`, `sym_source_file_name`, `sym_line_no`, `sym_column_no` |
257257
| `barrier_begin` | <div style="text-align: left"><li>**trace_type**: `xpti::trace_point_type_t::barrier_begin` that marks the beginning of a barrier while enqueuing a command group object</li> <li> **parent**: The global graph event that is created during the `graph_create` event.</li> <li> **event**: The event ID will reflect the ID of the command group object that has encountered a barrier during the enqueue operation. </li> <li> **instance**: Unique ID to allow the correlation of the `barrier_begin` event with the `barrier_end` event. </li> <li> **user_data**: String indicating `enqueue.barrier` and the reason for the barrier as a `const char *` </li> <p></p>The reason for the barrier could be one of `Buffer locked by host accessor`, `Blocked by host task` or `Unknown reason`.</div> | <li> Computational Kernels </li> `sycl_device`, `sycl_device_type`, `sycl_device_name`, `kernel_name`, `from_source`, `sym_function_name`, `sym_source_file_name`, `sym_line_no`, `sym_column_no` <li>Memory operations</li> `memory_object`, `offset`, `access_range_start`, `access_range_end`, `allocation_type`, `copy_from`, `copy_to` |
258258
| `barrier_end` | <div style="text-align: left"><li>**trace_type**: `xpti::trace_point_type_t::barrier_end` that marks the end of the barrier that is encountered during enqueue.</li> <li> **parent**: The global graph event that is created during the `graph_create` event.</li> <li> **event**: The event ID will reflect the ID of the command group object that has encountered a barrier during the enqueue operation. </li> <li> **instance**: Unique ID to allow the correlation of the `barrier_begin` event with the `barrier_end` event. </li> <li> **user_data**: String indicating `enqueue.barrier` and the reason for the barrier as a `const char *` </li> <p></p>The reason for the barrier could be one of `Buffer locked by host accessor`, `Blocked by host task` or `Unknown reason`.</div> | <li> Computational Kernels </li> `sycl_device`, `sycl_device_type`, `sycl_device_name`, `kernel_name`, `from_source`, `sym_function_name`, `sym_source_file_name`, `sym_line_no`, `sym_column_no` <li>Memory operations</li> `memory_object`, `offset`, `access_range_start`, `access_range_end`, `allocation_type`, `copy_from`, `copy_to` |
259+
| `diagnostics` | <div style="text-align: left"><li>**trace_type**: `xpti::trace_point_type_t::diagnostics` that represents general purpose notifications. For example, it is emitted when an exception is thrown in SYCL runtime. </li> <li> **parent**: Set to NULL.</li> <li> **event**: The event ID will reflect the code location of notification origin, if available. </li> <li> **instance**: An instance ID that records the number of times this code location has been seen. </li> <li> **user_data**: String with diagnostic message as a `const char *` </li></div> | `sym_function_name`, `sym_source_file_name`, `sym_line_no`, `sym_column_no` |
259260

260261
### Metadata description
261262

sycl/include/sycl/handler.hpp

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -656,7 +656,6 @@ class __SYCL_EXPORT handler {
656656
typename LambdaArgType>
657657
void StoreLambda(KernelType KernelFunc) {
658658
using KI = detail::KernelInfo<KernelName>;
659-
660659
constexpr bool IsCallableWithKernelHandler =
661660
detail::KernelLambdaHasKernelHandlerArgT<KernelType,
662661
LambdaArgType>::value;
@@ -670,7 +669,6 @@ class __SYCL_EXPORT handler {
670669
KernelType *KernelPtr =
671670
ResetHostKernel<KernelType, LambdaArgType, Dims>(KernelFunc);
672671

673-
using KI = sycl::detail::KernelInfo<KernelName>;
674672
constexpr bool KernelHasName =
675673
KI::getName() != nullptr && KI::getName()[0] != '\0';
676674

sycl/include/sycl/queue.hpp

Lines changed: 37 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -307,7 +307,7 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
307307
/// \return a SYCL event object for the submitted command group.
308308
template <typename T> event submit(T CGF _CODELOCPARAM(&CodeLoc)) {
309309
_CODELOCARG(&CodeLoc);
310-
310+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
311311
#if __SYCL_USE_FALLBACK_ASSERT
312312
auto PostProcess = [this, &CodeLoc](bool IsKernel, bool KernelUsesAssert,
313313
event &E) {
@@ -343,7 +343,7 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
343343
template <typename T>
344344
event submit(T CGF, queue &SecondaryQueue _CODELOCPARAM(&CodeLoc)) {
345345
_CODELOCARG(&CodeLoc);
346-
346+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
347347
#if __SYCL_USE_FALLBACK_ASSERT
348348
auto PostProcess = [this, &SecondaryQueue, &CodeLoc](
349349
bool IsKernel, bool KernelUsesAssert, event &E) {
@@ -432,7 +432,7 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
432432
/// @param CodeLoc is the code location of the submit call (default argument)
433433
void wait(_CODELOCONLYPARAM(&CodeLoc)) {
434434
_CODELOCARG(&CodeLoc);
435-
435+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
436436
wait_proxy(CodeLoc);
437437
}
438438

@@ -446,7 +446,7 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
446446
/// @param CodeLoc is the code location of the submit call (default argument)
447447
void wait_and_throw(_CODELOCONLYPARAM(&CodeLoc)) {
448448
_CODELOCARG(&CodeLoc);
449-
449+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
450450
wait_and_throw_proxy(CodeLoc);
451451
}
452452

@@ -1345,6 +1345,7 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
13451345
"sycl::queue.single_task() requires a kernel instead of command group. "
13461346
"Use queue.submit() instead");
13471347
_CODELOCARG(&CodeLoc);
1348+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
13481349
return submit(
13491350
[&](handler &CGH) {
13501351
CGH.template single_task<KernelName, KernelType, PropertiesT>(
@@ -1384,6 +1385,7 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
13841385
"sycl::queue.single_task() requires a kernel instead of command group. "
13851386
"Use queue.submit() instead");
13861387
_CODELOCARG(&CodeLoc);
1388+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
13871389
return submit(
13881390
[&](handler &CGH) {
13891391
CGH.depends_on(DepEvent);
@@ -1427,6 +1429,7 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
14271429
"sycl::queue.single_task() requires a kernel instead of command group. "
14281430
"Use queue.submit() instead");
14291431
_CODELOCARG(&CodeLoc);
1432+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
14301433
return submit(
14311434
[&](handler &CGH) {
14321435
CGH.depends_on(DepEvents);
@@ -1667,8 +1670,11 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
16671670
ext::oneapi::experimental::is_property_list<PropertiesT>::value,
16681671
event>
16691672
parallel_for(nd_range<Dims> Range, PropertiesT Properties, RestT &&...Rest) {
1670-
// Actual code location needs to be captured from KernelInfo object.
1671-
const detail::code_location CodeLoc = {};
1673+
using KI = sycl::detail::KernelInfo<KernelName>;
1674+
constexpr detail::code_location CodeLoc(
1675+
KI::getFileName(), KI::getFunctionName(), KI::getLineNumber(),
1676+
KI::getColumnNumber());
1677+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
16721678
return submit(
16731679
[&](handler &CGH) {
16741680
CGH.template parallel_for<KernelName>(Range, Properties, Rest...);
@@ -1701,8 +1707,11 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
17011707
template <typename KernelName = detail::auto_name, int Dims,
17021708
typename... RestT>
17031709
event parallel_for(nd_range<Dims> Range, event DepEvent, RestT &&...Rest) {
1704-
// Actual code location needs to be captured from KernelInfo object.
1705-
const detail::code_location CodeLoc = {};
1710+
using KI = sycl::detail::KernelInfo<KernelName>;
1711+
constexpr detail::code_location CodeLoc(
1712+
KI::getFileName(), KI::getFunctionName(), KI::getLineNumber(),
1713+
KI::getColumnNumber());
1714+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
17061715
return submit(
17071716
[&](handler &CGH) {
17081717
CGH.depends_on(DepEvent);
@@ -1723,8 +1732,11 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
17231732
typename... RestT>
17241733
event parallel_for(nd_range<Dims> Range, const std::vector<event> &DepEvents,
17251734
RestT &&...Rest) {
1726-
// Actual code location needs to be captured from KernelInfo object.
1727-
const detail::code_location CodeLoc = {};
1735+
using KI = sycl::detail::KernelInfo<KernelName>;
1736+
constexpr detail::code_location CodeLoc(
1737+
KI::getFileName(), KI::getFunctionName(), KI::getLineNumber(),
1738+
KI::getColumnNumber());
1739+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
17281740
return submit(
17291741
[&](handler &CGH) {
17301742
CGH.depends_on(DepEvents);
@@ -1950,8 +1962,11 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
19501962
event>
19511963
parallel_for_impl(range<Dims> Range, PropertiesT Properties,
19521964
RestT &&...Rest) {
1953-
// Actual code location needs to be captured from KernelInfo object.
1954-
const detail::code_location CodeLoc = {};
1965+
using KI = sycl::detail::KernelInfo<KernelName>;
1966+
constexpr detail::code_location CodeLoc(
1967+
KI::getFileName(), KI::getFunctionName(), KI::getLineNumber(),
1968+
KI::getColumnNumber());
1969+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
19551970
return submit(
19561971
[&](handler &CGH) {
19571972
CGH.template parallel_for<KernelName>(Range, Properties, Rest...);
@@ -1985,8 +2000,11 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
19852000
ext::oneapi::experimental::is_property_list<PropertiesT>::value, event>
19862001
parallel_for_impl(range<Dims> Range, event DepEvent, PropertiesT Properties,
19872002
RestT &&...Rest) {
1988-
// Actual code location needs to be captured from KernelInfo object.
1989-
const detail::code_location CodeLoc = {};
2003+
using KI = sycl::detail::KernelInfo<KernelName>;
2004+
constexpr detail::code_location CodeLoc(
2005+
KI::getFileName(), KI::getFunctionName(), KI::getLineNumber(),
2006+
KI::getColumnNumber());
2007+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
19902008
return submit(
19912009
[&](handler &CGH) {
19922010
CGH.depends_on(DepEvent);
@@ -2022,8 +2040,11 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
20222040
ext::oneapi::experimental::is_property_list<PropertiesT>::value, event>
20232041
parallel_for_impl(range<Dims> Range, const std::vector<event> &DepEvents,
20242042
PropertiesT Properties, RestT &&...Rest) {
2025-
// Actual code location needs to be captured from KernelInfo object.
2026-
const detail::code_location CodeLoc = {};
2043+
using KI = sycl::detail::KernelInfo<KernelName>;
2044+
constexpr detail::code_location CodeLoc(
2045+
KI::getFileName(), KI::getFunctionName(), KI::getLineNumber(),
2046+
KI::getColumnNumber());
2047+
detail::tls_code_loc_t TlsCodeLocCapture(CodeLoc);
20272048
return submit(
20282049
[&](handler &CGH) {
20292050
CGH.depends_on(DepEvents);

sycl/source/CMakeLists.txt

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,6 @@
33
#2. Use AddLLVM to modify the build and access config options
44
#cmake_policy(SET CMP0057 NEW)
55
#include(AddLLVM)
6-
76
configure_file(
87
${CMAKE_CURRENT_SOURCE_DIR}/version.rc.in
98
${CMAKE_CURRENT_BINARY_DIR}/version.rc

sycl/source/detail/global_handler.cpp

Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -71,6 +71,49 @@ std::atomic_uint ObjectUsageCounter::MCounter{0};
7171
GlobalHandler::GlobalHandler() = default;
7272
GlobalHandler::~GlobalHandler() = default;
7373

74+
void GlobalHandler::InitXPTI() {
75+
#ifdef XPTI_ENABLE_INSTRUMENTATION
76+
// Let subscribers know a new stream is being initialized
77+
getXPTIRegistry().initializeStream(SYCL_STREAM_NAME, GMajVer, GMinVer,
78+
GVerStr);
79+
xpti::payload_t SYCLPayload("SYCL Runtime Exceptions");
80+
uint64_t SYCLInstanceNo;
81+
GSYCLCallEvent = xptiMakeEvent("SYCL Try-catch Exceptions", &SYCLPayload,
82+
xpti::trace_algorithm_event, xpti_at::active,
83+
&SYCLInstanceNo);
84+
#endif
85+
}
86+
87+
void GlobalHandler::TraceEventXPTI(const char *Message) {
88+
#ifdef XPTI_ENABLE_INSTRUMENTATION
89+
if (!Message)
90+
return;
91+
if (xptiTraceEnabled()) {
92+
// We have to handle the cases where: (1) we may have just the code location
93+
// set and not UID and (2) UID set
94+
detail::tls_code_loc_t Tls;
95+
auto CodeLocation = Tls.query();
96+
97+
// Creating a tracepoint will convert a CodeLocation to UID, if not set
98+
xpti::framework::tracepoint_t TP(
99+
CodeLocation.fileName(), CodeLocation.functionName(),
100+
CodeLocation.lineNumber(), CodeLocation.columnNumber(), nullptr);
101+
102+
// The call to notify will have the signature of:
103+
// (1) the stream defined in .stream()
104+
// (2) The trace type equal to what is set by .trace_type()
105+
// (3) Parent event set to NULL
106+
// (4) Current event set to one created from CodeLocation and UID
107+
// (5) An instance ID that records the number of times this code location
108+
// has been seen (6) The message generated by the exception handler
109+
TP.stream(SYCL_STREAM_NAME)
110+
.trace_type(xpti::trace_point_type_t::diagnostics)
111+
.notify(static_cast<const void *>(Message));
112+
}
113+
114+
#endif
115+
}
116+
74117
GlobalHandler *&GlobalHandler::getInstancePtr() {
75118
static GlobalHandler *RTGlobalObjHandler = new GlobalHandler();
76119
return RTGlobalObjHandler;
@@ -79,6 +122,13 @@ GlobalHandler *&GlobalHandler::getInstancePtr() {
79122
GlobalHandler &GlobalHandler::instance() {
80123
GlobalHandler *RTGlobalObjHandler = GlobalHandler::getInstancePtr();
81124
assert(RTGlobalObjHandler && "Handler must not be deallocated earlier");
125+
126+
#ifdef XPTI_ENABLE_INSTRUMENTATION
127+
static std::once_flag InitXPTIFlag;
128+
if (xptiTraceEnabled()) {
129+
std::call_once(InitXPTIFlag, [&]() { RTGlobalObjHandler->InitXPTI(); });
130+
}
131+
#endif
82132
return *RTGlobalObjHandler;
83133
}
84134

sycl/source/detail/global_handler.hpp

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -79,10 +79,17 @@ class GlobalHandler {
7979
void drainThreadPool();
8080
void prepareSchedulerToRelease();
8181

82+
void InitXPTI();
83+
void TraceEventXPTI(const char *Message);
84+
8285
// For testing purposes only
8386
void attachScheduler(Scheduler *Scheduler);
8487

8588
private:
89+
#ifdef XPTI_ENABLE_INSTRUMENTATION
90+
void *GSYCLCallEvent = nullptr;
91+
#endif
92+
8693
friend void shutdown();
8794
friend class ObjectUsageCounter;
8895
static GlobalHandler *&getInstancePtr();

sycl/source/detail/pi.cpp

Lines changed: 0 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -38,10 +38,6 @@
3838
#include "xpti/xpti_trace_framework.h"
3939
#endif
4040

41-
#define STR(x) #x
42-
#define SYCL_VERSION_STR \
43-
"sycl " STR(__LIBSYCL_MAJOR_VERSION) "." STR(__LIBSYCL_MINOR_VERSION)
44-
4541
namespace sycl {
4642
__SYCL_INLINE_VER_NAMESPACE(_V1) {
4743
namespace detail {
@@ -54,11 +50,6 @@ xpti_td *GSYCLGraphEvent = nullptr;
5450
xpti_td *GPICallEvent = nullptr;
5551
/// Event to be used by PI layer calls with arguments
5652
xpti_td *GPIArgCallEvent = nullptr;
57-
/// Constants being used as placeholder until one is able to reliably get the
58-
/// version of the SYCL runtime
59-
constexpr uint32_t GMajVer = __LIBSYCL_MAJOR_VERSION;
60-
constexpr uint32_t GMinVer = __LIBSYCL_MINOR_VERSION;
61-
constexpr const char *GVerStr = SYCL_VERSION_STR;
6253
#endif // XPTI_ENABLE_INSTRUMENTATION
6354

6455
template <sycl::backend BE> void *getPluginOpaqueData(void *OpaqueDataParam) {

0 commit comments

Comments
 (0)