[SYCL][L0] Add XPTI-based Level Zero tracing #5796

Closed
wants to merge 14 commits into from
1 change: 1 addition & 0 deletions sycl/doc/EnvironmentVariables.md
Original file line number Diff line number Diff line change
@@ -150,6 +150,7 @@ variables in production code.</span>
| `SYCL_PI_LEVEL_ZERO_USE_COMPUTE_ENGINE` | Integer | It can be set to an integer (>=0) in which case all compute commands will be submitted to the command-queue with the given index in the compute command group. If it is instead set to a negative value then all available compute engines may be used. The default value is "0" |
| `SYCL_PI_LEVEL_ZERO_USE_COPY_ENGINE_FOR_D2D_COPY` (experimental) | Integer | Allows the use of copy engine, if available in the device, in Level Zero plugin for device to device copy operations. The default is 0. This option is experimental and will be removed once heuristics are added to make a decision about use of copy engine for device to device copy operations. |
| `SYCL_PI_LEVEL_ZERO_DEVICE_SCOPE_EVENTS` | Any(\*) | Enable support of device-scope events whose state is not visible to the host. If enabled mode is SYCL_PI_LEVEL_ZERO_DEVICE_SCOPE_EVENTS=1 the Level Zero plugin would create all events having device-scope only and create proxy host-visible events for them when their status is needed (wait/query) on the host. If enabled mode is SYCL_PI_LEVEL_ZERO_DEVICE_SCOPE_EVENTS=2 the Level Zero plugin would create all events having device-scope and add proxy host-visible event at the end of each command-list submission. The default is 0, meaning all events are host-visible. |
| `SYCL_PI_LEVEL_ZERO_ENABLE_TRACING` | Any(\*) | Enable XPTI-based tracing of Level Zero API calls in the Level Zero plugin. |

## Debugging variables for CUDA Plugin

20 changes: 20 additions & 0 deletions sycl/doc/design/SYCLInstrumentationUsingXPTI.md
@@ -299,3 +299,23 @@ All trace point types in bold provide semantic information about the graph, node
| `mem_alloc_end` | <div style="text-align: left"><li>**trace_type**: `xpti::trace_point_type_t::mem_alloc_end` that marks the end of memory allocation process</li> <li> **parent**: Event ID created for all functions in the `oneapi.level_zero.experimental.mem_alloc` layer.</li> <li> **event**: `nullptr` - since the stream of data just captures functions being called.</li> <li> **instance**: Unique ID to allow the correlation of the `mem_alloc_begin` event with the `mem_alloc_end` event. This value is guaranteed to be the same value received by the trace event for the corresponding `mem_alloc_begin`.</li> <li> **user_data**: A pointer to `mem_alloc_data_t` object, that includes memory object ID (if any), allocated pointer, allocation size, and guard zone size (if any). </li></div> | None |
| `mem_release_begin` | <div style="text-align: left"><li>**trace_type**: `xpti::trace_point_type_t::mem_release_begin` that marks the beginning of memory release process</li> <li> **parent**: Event ID created for all functions in the `oneapi.level_zero.experimental.mem_alloc` layer.</li> <li> **event**: `nullptr` - since the stream of data just captures functions being called.</li> <li> **instance**: Unique ID to allow the correlation of the `mem_release_begin` event with the `mem_release_end` event. </li> <li> **user_data**: A pointer to `mem_alloc_data_t` object, that includes memory object ID (if any) and released pointer. </li></div> | None |
| `mem_release_end` | <div style="text-align: left"><li>**trace_type**: `xpti::trace_point_type_t::mem_release_end` that marks the end of memory release process</li> <li> **parent**: Event ID created for all functions in the `oneapi.level_zero.experimental.mem_alloc` layer.</li> <li> **event**: `nullptr` - since the stream of data just captures functions being called.</li> <li> **instance**: Unique ID to allow the correlation of the `mem_release_begin` event with the `mem_release_end` event. This value is guaranteed to be the same value received by the trace event for the corresponding `mem_release_begin`.</li> <li> **user_data**: A pointer to `mem_alloc_data_t` object, that includes memory object ID (if any) and released pointer. </li></div> | None |

## SYCL Stream `"sycl.experimental.level_zero.call"` Notification Signatures

This stream transfers events about Level Zero API calls made by a SYCL
application.

| Trace Point Type | Parameter Description | Metadata |
| :--------------: | :-------------------- | :------- |
| `function_begin` | <div style="text-align: left"><li>**trace_type**: `xpti::trace_point_type_t::function_begin` that marks the beginning of a function</li> <li> **parent**: Event ID created for all functions in the `sycl.pi` layer.</li> <li> **event**: `nullptr` - since the stream of data just captures functions being called.</li> <li> **instance**: Unique ID to allow the correlation of the `function_begin` event with the `function_end` event. </li> <li> **user_data**: Name of the function being called sent in as `const char *` </li></div> | None |
| `function_end` | <div style="text-align: left"><li>**trace_type**: `xpti::trace_point_type_t::function_end` that marks the end of a function</li> <li> **parent**: Event ID created for all functions in the `sycl.pi` layer.</li> <li> **event**: `nullptr` - since the stream of data just captures functions being called.</li> <li> **instance**: Unique ID to allow the correlation of the `function_begin` event with the `function_end` event. This value is guaranteed to be the same value received by the trace event for the corresponding `function_begin` </li> <li> **user_data**: Name of the function being called sent in as `const char *` </li></div> | None |
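
The correlation rule in the table above (a `function_end` carries the same instance ID as its `function_begin`) can be modelled with a small piece of bookkeeping. A real subscriber would be a native library receiving XPTI callbacks; the sketch below only models the pairing logic in Python, and the `CallTimer` class, its method names, and the explicit `now` timestamps are hypothetical and chosen for illustration.

```python
import time


class CallTimer:
    """Pairs begin/end notifications by their instance (correlation) ID."""

    def __init__(self):
        self._open = {}      # instance ID -> (function name, start time)
        self.finished = []   # (function name, elapsed seconds)

    def on_begin(self, instance, name, now=None):
        # Remember when the call identified by this instance ID started.
        start = time.perf_counter() if now is None else now
        self._open[instance] = (name, start)

    def on_end(self, instance, name, now=None):
        # Look up the matching begin; ignore an end that has no begin.
        started = self._open.pop(instance, None)
        if started is None:
            return
        begin_name, start = started
        end = time.perf_counter() if now is None else now
        self.finished.append((begin_name, end - start))


timer = CallTimer()
timer.on_begin(7, "zeInit", now=1.0)
timer.on_begin(8, "zeDriverGet", now=2.0)
timer.on_end(8, "zeDriverGet", now=2.5)
timer.on_end(7, "zeInit", now=4.0)
print(timer.finished)
```

Keying on the instance ID rather than on the function name keeps the pairing correct even when the same function is called repeatedly or calls are nested.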

## SYCL Stream `"sycl.experimental.level_zero.debug"` Notification Signatures

This stream transfers events about Level Zero API calls made by a SYCL
application, together with their function arguments.

| Trace Point Type | Parameter Description | Metadata |
| :------------------------: | :-------------------- | :------- |
| `function_with_args_begin` | <div style="text-align: left"><li>**trace_type**: `xpti::trace_point_type_t::function_with_args_begin` that marks the beginning of a function</li> <li> **parent**: Event ID created for all functions in the `sycl.pi.debug` layer.</li> <li> **event**: `nullptr` - since the stream of data just captures functions being called.</li> <li> **instance**: Unique ID to allow the correlation of the `function_with_args_begin` event with the `function_with_args_end` event. </li> <li> **user_data**: A pointer to `function_with_args_t` object, that includes function ID, name, and arguments. </li></div> | None |
| `function_with_args_end` | <div style="text-align: left"><li>**trace_type**: `xpti::trace_point_type_t::function_with_args_end` that marks the end of a function</li> <li> **parent**: Event ID created for all functions in the `sycl.pi.debug` layer.</li> <li> **event**: `nullptr` - since the stream of data just captures functions being called.</li> <li> **instance**: Unique ID to allow the correlation of the `function_with_args_begin` event with the `function_with_args_end` event. This value is guaranteed to be the same value received by the trace event for the corresponding `function_with_args_begin` </li> <li> **user_data**: A pointer to `function_with_args_t` object, that includes function ID, name, arguments, and return value. </li></div> | None |
28 changes: 28 additions & 0 deletions sycl/plugins/level_zero/CMakeLists.txt
@@ -99,6 +99,10 @@ target_include_directories(LevelZeroLoader-Headers
INTERFACE "${LEVEL_ZERO_INCLUDE_DIR}"
)

if (SYCL_ENABLE_XPTI_TRACING)
set(XPTI_PROXY_SRC "${CMAKE_SOURCE_DIR}/../xpti/src/xpti_proxy.cpp")
endif()

find_package(Threads REQUIRED)
add_sycl_plugin(level_zero
SOURCES
@@ -107,12 +111,36 @@ add_sycl_plugin(level_zero
"${CMAKE_CURRENT_SOURCE_DIR}/pi_level_zero.hpp"
"${CMAKE_CURRENT_SOURCE_DIR}/usm_allocator.cpp"
"${CMAKE_CURRENT_SOURCE_DIR}/usm_allocator.hpp"
"${CMAKE_CURRENT_SOURCE_DIR}/tracing.cpp"
${XPTI_PROXY_SRC}
LIBRARIES
"${LEVEL_ZERO_LOADER}"
Threads::Threads
)

find_package(Python3 REQUIRED)

add_custom_target(ze-api
COMMAND ${Python3_EXECUTABLE}
${CMAKE_CURRENT_SOURCE_DIR}/ze_api_generator.py
${LEVEL_ZERO_INCLUDE_DIR}/level_zero/ze_api.h
BYPRODUCTS
${CMAKE_CURRENT_BINARY_DIR}/ze_api.def
)
target_include_directories(pi_level_zero PRIVATE ${CMAKE_CURRENT_BINARY_DIR})
add_dependencies(pi_level_zero ze-api)

if (SYCL_ENABLE_XPTI_TRACING)
target_compile_definitions(pi_level_zero PRIVATE
XPTI_ENABLE_INSTRUMENTATION
XPTI_STATIC_LIBRARY
)
target_include_directories(pi_level_zero PRIVATE "${CMAKE_SOURCE_DIR}/../xpti/include")
target_link_libraries(pi_level_zero PRIVATE ${CMAKE_DL_LIBS})
endif()

if (TARGET level-zero-loader)
add_dependencies(ze-api level-zero-loader)
add_dependencies(pi_level_zero level-zero-loader)
endif()

6 changes: 6 additions & 0 deletions sycl/plugins/level_zero/pi_level_zero.cpp
@@ -36,6 +36,8 @@ static pi_result EventCreate(pi_context Context, pi_queue Queue,
bool HostVisible, pi_event *RetEvent);
}

void enableL0Tracing();

namespace {

// Controls Level Zero calls serialization to w/a Level Zero driver being not MT
@@ -7664,6 +7666,10 @@ pi_result piPluginInit(pi_plugin *PluginInit) {
(PluginInit->PiFunctionTable).api = (decltype(&::api))(&api);
#include <CL/sycl/detail/pi.def>

if (std::getenv("SYCL_PI_LEVEL_ZERO_ENABLE_TRACING") != nullptr) {
enableL0Tracing();
}

return PI_SUCCESS;
}
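
The check above tests only whether the variable is present, which matches the `Any(*)` type in the documentation table: any value, including `0`, requests tracing. A minimal Python model of that presence test follows; the helper name `l0_tracing_requested` is hypothetical, and the sketch does not model the later `xptiTraceEnabled()` check inside `enableL0Tracing()`.

```python
import os


def l0_tracing_requested(env=None):
    """Mirror `std::getenv(name) != nullptr`: presence matters, not the value."""
    env = os.environ if env is None else env
    return env.get("SYCL_PI_LEVEL_ZERO_ENABLE_TRACING") is not None


# Setting the variable to "0" still requests tracing in this model.
print(l0_tracing_requested({"SYCL_PI_LEVEL_ZERO_ENABLE_TRACING": "0"}))
print(l0_tracing_requested({}))
```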

145 changes: 145 additions & 0 deletions sycl/plugins/level_zero/tracing.cpp
@@ -0,0 +1,145 @@
//===-------------- tracing.cpp - L0 Host API Tracing ----------------------==//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "xpti/xpti_data_types.h"
#include <exception>
#include <level_zero/layers/zel_tracing_api.h>
#include <level_zero/ze_api.h>
#include <xpti/xpti_trace_framework.h>

#include <iostream>

constexpr auto L0_CALL_STREAM_NAME = "sycl.experimental.level_zero.call";
constexpr auto L0_DEBUG_STREAM_NAME = "sycl.experimental.level_zero.debug";

thread_local uint64_t CallCorrelationID = 0;
thread_local uint64_t DebugCorrelationID = 0;

constexpr auto GVerStr = "0.1";
constexpr int GMajVer = 0;
constexpr int GMinVer = 1;

#ifdef XPTI_ENABLE_INSTRUMENTATION
static xpti_td *GCallEvent = nullptr;
static xpti_td *GDebugEvent = nullptr;
#endif // XPTI_ENABLE_INSTRUMENTATION

enum class ZEApiKind {
#define _ZE_API(call, domain, cb, params_type) call,
#include "ze_api.def"
#undef _ZE_API
};
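
Because the enumerators are produced by expanding `_ZE_API` over `ze_api.def` in file order, the numeric function ID sent on the debug stream is simply the zero-based position of the call in that file. The Python model below illustrates this mapping; the `DEF_TEXT` excerpt is a hypothetical sample (the real file depends on the installed `ze_api.h`), and `api_names` is a helper introduced only for this sketch.

```python
from enum import IntEnum

# Hypothetical excerpt of a generated ze_api.def.
DEF_TEXT = """\
_ZE_API(zeInit, Global, pfnInitCb, ze_init_params_t)
_ZE_API(zeDriverGet, Driver, pfnGetCb, ze_driver_get_params_t)
_ZE_API(zeDeviceGet, Device, pfnGetCb, ze_device_get_params_t)
"""


def api_names(def_text):
    """Return the first macro argument (the API name) of every _ZE_API line."""
    names = []
    for line in def_text.splitlines():
        line = line.strip()
        if line.startswith("_ZE_API(") and line.endswith(")"):
            args = line[len("_ZE_API("):-1].split(",")
            names.append(args[0].strip())
    return names


# Like a C++ enum without explicit values, numbering starts at 0.
ZEApiKind = IntEnum("ZEApiKind", api_names(DEF_TEXT), start=0)

for kind in ZEApiKind:
    print(int(kind), kind.name)
```

A consequence of this scheme is that the IDs are only stable for a given `ze_api.h`; a tool decoding them would need the same generated list.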

void enableL0Tracing() {
#ifdef XPTI_ENABLE_INSTRUMENTATION
if (!xptiTraceEnabled())
return;

xptiRegisterStream(L0_CALL_STREAM_NAME);
xptiInitialize(L0_CALL_STREAM_NAME, GMajVer, GMinVer, GVerStr);
xptiRegisterStream(L0_DEBUG_STREAM_NAME);
xptiInitialize(L0_DEBUG_STREAM_NAME, GMajVer, GMinVer, GVerStr);

uint64_t Dummy;
xpti::payload_t L0Payload("Level Zero Plugin Layer");
GCallEvent =
xptiMakeEvent("L0 Plugin Layer", &L0Payload, xpti::trace_algorithm_event,
xpti_at::active, &Dummy);

xpti::payload_t L0DebugPayload("L0 Plugin Debug Layer");
GDebugEvent =
xptiMakeEvent("L0 Plugin Debug Layer", &L0DebugPayload,
xpti::trace_algorithm_event, xpti_at::active, &Dummy);

ze_result_t Status = zeInit(0);
if (Status != ZE_RESULT_SUCCESS) {
// Most likely there are no Level Zero devices.
return;
}

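// The tracer descriptor is given a non-null user-data pointer; the callbacks
// below never read it. It is static so the pointer stored in the tracer stays
// valid after this function returns.
static int Foo = 0;
zel_tracer_desc_t TracerDesc = {ZEL_STRUCTURE_TYPE_TRACER_EXP_DESC, nullptr,
&Foo};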
zel_tracer_handle_t Tracer = nullptr;

Status = zelTracerCreate(&TracerDesc, &Tracer);

if (Status != ZE_RESULT_SUCCESS || Tracer == nullptr) {
std::cerr << "[WARNING] Failed to create L0 tracer: " << Status << "\n";
return;
}

zel_core_callbacks_t Prologue = {};
zel_core_callbacks_t Epilogue = {};

#define _ZE_API(call, domain, cb, params_type) \
Prologue.domain.cb = [](params_type *Params, ze_result_t, void *, void **) { \
if (xptiTraceEnabled()) { \
uint8_t CallStreamID = xptiRegisterStream(L0_CALL_STREAM_NAME); \
uint8_t DebugStreamID = xptiRegisterStream(L0_DEBUG_STREAM_NAME); \
CallCorrelationID = xptiGetUniqueId(); \
DebugCorrelationID = xptiGetUniqueId(); \
const char *FuncName = #call; \
xptiNotifySubscribers( \
CallStreamID, (uint16_t)xpti::trace_point_type_t::function_begin, \
GCallEvent, nullptr, CallCorrelationID, FuncName); \
uint32_t FuncID = static_cast<uint32_t>(ZEApiKind::call); \
xpti::function_with_args_t Payload{FuncID, FuncName, Params, nullptr, \
nullptr}; \
xptiNotifySubscribers( \
DebugStreamID, \
(uint16_t)xpti::trace_point_type_t::function_with_args_begin, \
GDebugEvent, nullptr, DebugCorrelationID, &Payload); \
} \
}; \
Epilogue.domain.cb = [](params_type *Params, ze_result_t Result, void *, \
void **) { \
if (xptiTraceEnabled()) { \
uint8_t CallStreamID = xptiRegisterStream(L0_CALL_STREAM_NAME); \
uint8_t DebugStreamID = xptiRegisterStream(L0_DEBUG_STREAM_NAME); \
const char *FuncName = #call; \
xptiNotifySubscribers(CallStreamID, \
(uint16_t)xpti::trace_point_type_t::function_end, \
GCallEvent, nullptr, CallCorrelationID, FuncName); \
uint32_t FuncID = static_cast<uint32_t>(ZEApiKind::call); \
xpti::function_with_args_t Payload{FuncID, FuncName, Params, &Result, \
nullptr}; \
xptiNotifySubscribers( \
DebugStreamID, \
(uint16_t)xpti::trace_point_type_t::function_with_args_end, \
GDebugEvent, nullptr, DebugCorrelationID, &Payload); \
} \
};

#include "ze_api.def"

#undef _ZE_API

Status = zelTracerSetPrologues(Tracer, &Prologue);
if (Status != ZE_RESULT_SUCCESS) {
std::cerr << "Failed to set L0 tracing prologues\n";
std::terminate();
}
Status = zelTracerSetEpilogues(Tracer, &Epilogue);
if (Status != ZE_RESULT_SUCCESS) {
std::cerr << "Failed to set L0 tracing epilogues\n";
std::terminate();
}

Status = zelTracerSetEnabled(Tracer, true);
if (Status != ZE_RESULT_SUCCESS) {
std::cerr << "Failed to enable L0 tracing\n";
std::terminate();
}
#endif
}

void disableL0Tracing() {
#ifdef XPTI_ENABLE_INSTRUMENTATION
xptiFinalize(L0_CALL_STREAM_NAME);
xptiFinalize(L0_DEBUG_STREAM_NAME);
#endif // XPTI_ENABLE_INSTRUMENTATION
}
40 changes: 40 additions & 0 deletions sycl/plugins/level_zero/ze_api_generator.py
@@ -0,0 +1,40 @@
import re
import sys

def camel_to_snake(src):
    return re.sub(r'(?<!^)(?=[A-Z])', '_', src).lower()

def snake_to_camel(src):
    temp = src.split('_')
    return ''.join(x.title() for x in temp)


def extract_ze_apis(header):
    """
    Emit file with contents of
    _ZE_API(api_name, api_domain, cb, param_type)
    """
    api = open("ze_api.def", "w")

    matches = re.finditer(r'typedef struct _ze_([_a-z]+)_callbacks_t\n\{\n([a-zA-Z_;\s\n]+)\n\} ze_([_a-z]+)_callbacks_t;', header)

    for match in matches:
        api_domain = snake_to_camel(match.group(1))
        for l in match.group(2).splitlines():
            parts = l.split()
            api_match = re.match(r'ze_pfn([a-zA-Z]+)Cb_t', parts[0])
            api_name_tail = api_match.group(1)
            api_name = 'ze' + api_name_tail

            param_type = 'ze_' + camel_to_snake(api_name_tail) + '_params_t'

            cb = 'pfn' + api_name_tail.replace(api_domain, '') + 'Cb'

            api.write("_ZE_API({}, {}, {}, {})\n".format(api_name, api_domain, cb, param_type))

    api.close()

if __name__ == "__main__":
    with open(sys.argv[1], 'r') as f:
        header = f.read()
        extract_ze_apis(header)
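
To make the generator's output concrete, the sketch below applies the same regular expressions to a small in-memory header fragment and collects the `_ZE_API(...)` lines instead of writing `ze_api.def`. The `SAMPLE_HEADER` text is a hypothetical excerpt written for this example, and `ze_api_entries` is an illustrative variant of `extract_ze_apis` that also skips lines it cannot parse, which the script above does not do.

```python
import re

# Hypothetical fragment in the shape the generator expects from ze_api.h.
SAMPLE_HEADER = """\
typedef struct _ze_command_list_callbacks_t
{
    ze_pfnCommandListCreateCb_t    pfnCreateCb;
    ze_pfnCommandListCloseCb_t     pfnCloseCb;
} ze_command_list_callbacks_t;
"""


def camel_to_snake(src):
    # Insert '_' before every capital letter except the first, then lowercase.
    return re.sub(r'(?<!^)(?=[A-Z])', '_', src).lower()


def snake_to_camel(src):
    return ''.join(x.title() for x in src.split('_'))


def ze_api_entries(header):
    """Return the _ZE_API(...) lines for every callbacks struct in `header`."""
    pattern = (r'typedef struct _ze_([_a-z]+)_callbacks_t\n\{\n'
               r'([a-zA-Z_;\s\n]+)\n\} ze_([_a-z]+)_callbacks_t;')
    entries = []
    for match in re.finditer(pattern, header):
        api_domain = snake_to_camel(match.group(1))
        for line in match.group(2).splitlines():
            parts = line.split()
            if not parts:
                continue  # blank line inside the struct body
            api_match = re.match(r'ze_pfn([a-zA-Z]+)Cb_t', parts[0])
            if api_match is None:
                continue  # not a callback member
            tail = api_match.group(1)
            param_type = 'ze_' + camel_to_snake(tail) + '_params_t'
            cb = 'pfn' + tail.replace(api_domain, '') + 'Cb'
            entries.append("_ZE_API({}, {}, {}, {})".format(
                'ze' + tail, api_domain, cb, param_type))
    return entries


for entry in ze_api_entries(SAMPLE_HEADER):
    print(entry)
```

Each emitted line supplies the four macro arguments that `tracing.cpp` consumes: the API name, the callback domain, the callback member, and the parameter struct type.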