[SYCL] Windows Proxy Loader for DLLs #8242

Merged
21 commits
d726e1c
DLLs manually loaded by SYCL are not tracked as direct dependencies in…
cperkinsintel Dec 13, 2022
1457761
stray comment removed
cperkinsintel Dec 13, 2022
052afd7
DLL_PROCESS_DETACH added to sycl_pi_trace
cperkinsintel Dec 19, 2022
c723c45
clang-format
cperkinsintel Dec 19, 2022
5b70c21
yet more clang-format
cperkinsintel Dec 20, 2022
c648c09
Merge branch 'sycl' into cperkins-win_proxy_loader
cperkinsintel Dec 21, 2022
a935f29
Merge branch 'sycl' into cperkins-win_proxy_loader
cperkinsintel Dec 23, 2022
defd2d7
unified runtime and error suppression
cperkinsintel Dec 23, 2022
06fdb87
clang-format for not the last time
cperkinsintel Dec 23, 2022
454f4c8
moar clang-format
cperkinsintel Dec 23, 2022
1064627
Merge branch 'sycl' into cperkins-win_proxy_loader
cperkinsintel Jan 2, 2023
731a83d
restoring static default context tracker. Hip seems to need it.
cperkinsintel Jan 4, 2023
a1075d9
Merge branch 'sycl' into cperkins-win_proxy_loader
cperkinsintel Jan 5, 2023
1753fd5
when using XPTI we re-encounter dll unload issues. I believe I've eli…
cperkinsintel Jan 6, 2023
dd8b47c
temp remove //shutdown
cperkinsintel Feb 7, 2023
6634c8c
merge update and limit shutdown exposure
cperkinsintel Feb 7, 2023
5088e11
ensuring win_proxy_loader is part of CMake install
cperkinsintel Jan 17, 2023
ad68a83
works locally. having trouble with CI
cperkinsintel Jan 19, 2023
5e51e12
reviewer feedback
cperkinsintel Feb 17, 2023
67d4c1b
wrap static in function
cperkinsintel Feb 20, 2023
dd7db2c
reviewer feedback
cperkinsintel Feb 22, 2023
4 changes: 4 additions & 0 deletions sycl/CMakeLists.txt
@@ -309,6 +309,7 @@ add_custom_target( sycl-toolchain
DEPENDS sycl-runtime-libraries
sycl-compiler
sycl-ls
win_proxy_loader
${XPTIFW_LIBS}
COMMENT "Building SYCL compiler toolchain..."
)
@@ -341,6 +342,8 @@ add_subdirectory( plugins )

add_subdirectory(tools)

add_subdirectory(win_proxy_loader)

if(SYCL_INCLUDE_TESTS)
if(NOT LLVM_INCLUDE_TESTS)
message(FATAL_ERROR
@@ -383,6 +386,7 @@ set( SYCL_TOOLCHAIN_DEPLOY_COMPONENTS
sycl
libsycldevice
level-zero-sycl-dev
win_proxy_loader
${XPTIFW_LIBS}
${SYCL_TOOLCHAIN_DEPS}
)
6 changes: 4 additions & 2 deletions sycl/include/sycl/detail/pi.hpp
@@ -62,6 +62,8 @@ enum TraceLevel {
bool trace(TraceLevel level);

#ifdef __SYCL_RT_OS_WINDOWS
// these same constants are used by win_proxy_loader.dll
// if a plugin is added here, add it there as well.
#ifdef _MSC_VER
#define __SYCL_OPENCL_PLUGIN_NAME "pi_opencl.dll"
#define __SYCL_LEVEL_ZERO_PLUGIN_NAME "pi_level_zero.dll"
@@ -150,11 +152,11 @@ __SYCL_EXPORT void contextSetExtendedDeleter(const sycl::context &constext,

// Function to load the shared library
// Implementation is OS dependent.
- void *loadOsLibrary(const std::string &Library);
+ void *loadOsPluginLibrary(const std::string &Library);

// Function to unload the shared library
// Implementation is OS dependent (see posix-pi.cpp and windows-pi.cpp)
- int unloadOsLibrary(void *Library);
+ int unloadOsPluginLibrary(void *Library);

// OS agnostic function to unload the shared library
int unloadPlugin(void *Library);
42 changes: 42 additions & 0 deletions sycl/plugins/common_win_pi_trace/common_win_pi_trace.hpp
@@ -0,0 +1,42 @@
//==------------ common_win_pi_trace.hpp - SYCL standard header file -------==//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

// this .hpp is injected. Be sure to define __SYCL_PLUGIN_DLL_NAME before
// including
#ifdef _WIN32
#include <windows.h>
BOOL WINAPI DllMain(HINSTANCE hinstDLL, // handle to DLL module
DWORD fdwReason, // reason for calling function
LPVOID lpReserved) { // reserved

bool PrintPiTrace = false;
static const char *PiTrace = std::getenv("SYCL_PI_TRACE");
static const int PiTraceValue = PiTrace ? std::stoi(PiTrace) : 0;
if (PiTraceValue == -1 || PiTraceValue == 2) { // Means print all PI traces
PrintPiTrace = true;
}

// Perform actions based on the reason for calling.
switch (fdwReason) {
case DLL_PROCESS_DETACH:
if (PrintPiTrace)
std::cout << "---> DLL_PROCESS_DETACH " << __SYCL_PLUGIN_DLL_NAME << "\n"
<< std::endl;

break;
case DLL_PROCESS_ATTACH:
if (PrintPiTrace)
std::cout << "---> DLL_PROCESS_ATTACH " << __SYCL_PLUGIN_DLL_NAME << "\n"
<< std::endl;
case DLL_THREAD_ATTACH:
case DLL_THREAD_DETACH:
break;
}
return TRUE;
}
#endif // _WIN32
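The trace check in this header converts the environment value with `std::stoi`, which throws `std::invalid_argument` for non-numeric text such as `SYCL_PI_TRACE=abc`. The following is a minimal portable sketch of the same decision, under the assumption that unset or malformed values should simply mean "tracing off". `piTraceEnabled` and `piTraceEnabledFromEnv` are hypothetical helper names, not part of the PR. Note that the sketch is stricter than `std::stoi`: a value with trailing characters, such as `2x`, is rejected here.

```cpp
#include <cstdlib>

// Hypothetical helper (not in the PR): decide whether "print all PI traces"
// is requested, given the raw text of SYCL_PI_TRACE (may be nullptr).
// Values -1 and 2 enable the trace, mirroring the check in the header.
// Assumption: unset, empty, or malformed values mean "off" instead of throwing.
bool piTraceEnabled(const char *Raw) {
  if (!Raw || *Raw == '\0')
    return false;
  char *End = nullptr;
  const long Value = std::strtol(Raw, &End, 10);
  if (*End != '\0') // trailing characters such as "2x", or no digits at all
    return false;
  return Value == -1 || Value == 2;
}

// Convenience wrapper that reads the real environment variable.
bool piTraceEnabledFromEnv() {
  return piTraceEnabled(std::getenv("SYCL_PI_TRACE"));
}
```

Keeping the parse free of exceptions matters most inside `DllMain`, where an escaping exception is hard to diagnose.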
10 changes: 10 additions & 0 deletions sycl/plugins/cuda/pi_cuda.cpp
@@ -5666,6 +5666,10 @@ pi_result cuda_piextEnqueueDeviceGlobalVariableRead(
}

// This API is called by Sycl RT to notify the end of the plugin lifetime.
// Windows: dynamically loaded plugins might have been unloaded already
// when this is called. Sycl RT holds onto the PI plugin so it can be
// called safely. But this is not transitive. If the PI plugin in turn
// dynamically loaded a different DLL, that may have been unloaded.
// TODO: add a global variable lifetime management code here (see
// pi_level_zero.cpp for reference) Currently this is just a NOOP.
pi_result cuda_piTearDown(void *) {
@@ -5862,6 +5866,12 @@ pi_result piPluginInit(pi_plugin *PluginInit) {
return PI_SUCCESS;
}

#ifdef _WIN32
#define __SYCL_PLUGIN_DLL_NAME "pi_cuda.dll"
#include "../common_win_pi_trace/common_win_pi_trace.hpp"
#undef __SYCL_PLUGIN_DLL_NAME
#endif

} // extern "C"

CUevent _pi_platform::evBase_{nullptr};
10 changes: 10 additions & 0 deletions sycl/plugins/esimd_emulator/pi_esimd_emulator.cpp
@@ -2048,6 +2048,10 @@ pi_result piextPluginGetOpaqueData(void *, void **OpaqueDataReturn) {
return PI_SUCCESS;
}

// Windows: dynamically loaded plugins might have been unloaded already
// when this is called. Sycl RT holds onto the PI plugin so it can be
// called safely. But this is not transitive. If the PI plugin in turn
// dynamically loaded a different DLL, that may have been unloaded.
pi_result piTearDown(void *) {
delete reinterpret_cast<sycl::detail::ESIMDEmuPluginOpaqueData *>(
PiESimdDeviceAccess->data);
@@ -2102,4 +2106,10 @@ pi_result piPluginInit(pi_plugin *PluginInit) {
return PI_SUCCESS;
}

#ifdef _WIN32
#define __SYCL_PLUGIN_DLL_NAME "pi_esimd_emulator.dll"
#include "../common_win_pi_trace/common_win_pi_trace.hpp"
#undef __SYCL_PLUGIN_DLL_NAME
#endif

} // extern C
10 changes: 10 additions & 0 deletions sycl/plugins/hip/pi_hip.cpp
@@ -5321,6 +5321,10 @@ pi_result hip_piextEnqueueDeviceGlobalVariableRead(
}

// This API is called by Sycl RT to notify the end of the plugin lifetime.
// Windows: dynamically loaded plugins might have been unloaded already
// when this is called. Sycl RT holds onto the PI plugin so it can be
// called safely. But this is not transitive. If the PI plugin in turn
// dynamically loaded a different DLL, that may have been unloaded.
// TODO: add a global variable lifetime management code here (see
// pi_level_zero.cpp for reference) Currently this is just a NOOP.
pi_result hip_piTearDown(void *PluginParameter) {
@@ -5510,6 +5514,12 @@ pi_result piPluginInit(pi_plugin *PluginInit) {
return PI_SUCCESS;
}

#ifdef _WIN32
#define __SYCL_PLUGIN_DLL_NAME "pi_hip.dll"
#include "../common_win_pi_trace/common_win_pi_trace.hpp"
#undef __SYCL_PLUGIN_DLL_NAME
#endif

} // extern "C"

hipEvent_t _pi_platform::evBase_{nullptr};
10 changes: 10 additions & 0 deletions sycl/plugins/level_zero/pi_level_zero.cpp
@@ -9055,6 +9055,10 @@ pi_result piextPluginGetOpaqueData(void *opaque_data_param,
}

// SYCL RT calls this api to notify the end of plugin lifetime.
// Windows: dynamically loaded plugins might have been unloaded already
// when this is called. Sycl RT holds onto the PI plugin so it can be
// called safely. But this is not transitive. If the PI plugin in turn
// dynamically loaded a different DLL, that may have been unloaded.
// It can include all the jobs to tear down resources before
// the plugin is unloaded from memory.
pi_result piTearDown(void *PluginParameter) {
@@ -9438,4 +9442,10 @@ pi_result piGetDeviceAndHostTimer(pi_device Device, uint64_t *DeviceTime,
}
return PI_SUCCESS;
}

#ifdef _WIN32
#define __SYCL_PLUGIN_DLL_NAME "pi_level_zero.dll"
#include "../common_win_pi_trace/common_win_pi_trace.hpp"
#undef __SYCL_PLUGIN_DLL_NAME
#endif
} // extern "C"
10 changes: 10 additions & 0 deletions sycl/plugins/opencl/pi_opencl.cpp
@@ -1745,6 +1745,10 @@ pi_result piextKernelGetNativeHandle(pi_kernel kernel,
}

// This API is called by Sycl RT to notify the end of the plugin lifetime.
// Windows: dynamically loaded plugins might have been unloaded already
// when this is called. Sycl RT holds onto the PI plugin so it can be
// called safely. But this is not transitive. If the PI plugin in turn
// dynamically loaded a different DLL, that may have been unloaded.
// TODO: add a global variable lifetime management code here (see
// pi_level_zero.cpp for reference) Currently this is just a NOOP.
pi_result piTearDown(void *PluginParameter) {
@@ -1941,4 +1945,10 @@ pi_result piPluginInit(pi_plugin *PluginInit) {
return PI_SUCCESS;
}

#ifdef _WIN32
#define __SYCL_PLUGIN_DLL_NAME "pi_opencl.dll"
#include "../common_win_pi_trace/common_win_pi_trace.hpp"
#undef __SYCL_PLUGIN_DLL_NAME
#endif

} // end extern 'C'
10 changes: 10 additions & 0 deletions sycl/source/CMakeLists.txt
@@ -53,6 +53,14 @@ function(add_sycl_rt_library LIB_NAME LIB_OBJ_NAME)
target_link_libraries(${LIB_NAME} PRIVATE ${ARG_XPTI_LIB})
endif()

# win_proxy_loader
include_directories(${LLVM_EXTERNAL_SYCL_SOURCE_DIR}/win_proxy_loader)
if(WIN_DUPE)
target_link_libraries(${LIB_NAME} PUBLIC win_proxy_loaderd)
else()
target_link_libraries(${LIB_NAME} PUBLIC win_proxy_loader)
endif()

target_compile_definitions(${LIB_OBJ_NAME} PRIVATE __SYCL_INTERNAL_API )

if (WIN32)
@@ -215,11 +223,13 @@ if (MSVC)
string(REGEX REPLACE "/MT" "" ${flag_var} "${${flag_var}}")
endforeach()

set(WIN_DUPE "1")
if (SYCL_ENABLE_XPTI_TRACING)
add_sycl_rt_library(sycl${SYCL_MAJOR_VERSION}d sycld_object XPTI_LIB xptid COMPILE_OPTIONS "/MDd" SOURCES ${SYCL_SOURCES})
else()
add_sycl_rt_library(sycl${SYCL_MAJOR_VERSION}d sycld_object COMPILE_OPTIONS "/MDd" SOURCES ${SYCL_SOURCES})
endif()
unset(WIN_DUPE)
add_library(sycld ALIAS sycl${SYCL_MAJOR_VERSION}d)

set(SYCL_EXTRA_OPTS "/MD")
2 changes: 1 addition & 1 deletion sycl/source/detail/context_impl.cpp
@@ -136,7 +136,7 @@ context_impl::~context_impl() {
}
if (!MHostContext) {
// TODO catch an exception and put it to list of asynchronous exceptions
- getPlugin().call<PiApiKind::piContextRelease>(MContext);
+ getPlugin().call_nocheck<PiApiKind::piContextRelease>(MContext);
}
}
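The destructor change swaps `call` for `call_nocheck`. A minimal sketch of why that matters, assuming (as the names suggest) that the checked variant turns a failing result into an exception while the unchecked variant just returns the result code. `Result`, `MiniPlugin` and `MiniContext` are hypothetical stand-ins, not the runtime's real types.

```cpp
#include <stdexcept>

// Hypothetical stand-ins; the real PI result type and plugin class differ.
enum class Result { Success, Failure };

struct MiniPlugin {
  // Unchecked variant: hands the raw result code back to the caller.
  template <typename Fn> Result callNoCheck(Fn &&F) const { return F(); }

  // Checked variant: converts any failure into an exception.
  template <typename Fn> void call(Fn &&F) const {
    if (callNoCheck(F) != Result::Success)
      throw std::runtime_error("plugin call failed");
  }
};

// A resource whose destructor releases through the unchecked path.
// Destructors are implicitly noexcept, so an exception escaping from one
// calls std::terminate; ignoring the result code avoids that.
struct MiniContext {
  const MiniPlugin &Plugin;
  bool ReleaseFails;
  MiniContext(const MiniPlugin &P, bool Fails)
      : Plugin(P), ReleaseFails(Fails) {}
  ~MiniContext() {
    (void)Plugin.callNoCheck([this] {
      return ReleaseFails ? Result::Failure : Result::Success;
    });
  }
};
```

With the checked path in the destructor, a release that fails during process teardown would terminate the program; with the unchecked path the failure is silently dropped, which is the trade-off this sketch illustrates.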

38 changes: 36 additions & 2 deletions sycl/source/detail/global_handler.cpp
@@ -165,6 +165,10 @@ void GlobalHandler::releaseDefaultContexts() {
// finished. To avoid calls to nowhere, intentionally leak platform to device
// cache. This will prevent destructors from being called, thus no PI cleanup
// routines will be called in the end.
// Update: the win_proxy_loader addresses this for SYCL's own dependencies,
// but the GPU device dlls seem to manually load yet another DLL which may
// have been released when this function is called. So we still release() and
// leak until that is addressed. context destructs fine on CPU device.
MPlatformToDefaultContextCache.Inst.release();
#endif
}
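The comment in this hunk says win_proxy_loader keeps SYCL's own dependencies available at teardown. The loader's source is not part of this excerpt, so the following is only a sketch of one way such a proxy could work, assuming it loads each plugin once up front and later hands out the stored handle instead of loading again. All names (`ProxyLoader`, `preload`, `get`, `LoadFn`) are hypothetical; on Windows the injected function would wrap an OS call such as `LoadLibraryExA`.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical sketch: a registry that loads libraries once and keeps the
// handles until it is itself destroyed. The OS call is injected so the
// sketch runs on any platform.
class ProxyLoader {
public:
  using LoadFn = std::function<void *(const std::string &)>;

  explicit ProxyLoader(LoadFn Load) : MLoad(std::move(Load)) {}

  // Load a library up front and remember its handle (nullptr on failure).
  // A path that was already preloaded is not loaded a second time.
  void preload(const std::string &Path) {
    if (MHandles.find(Path) == MHandles.end())
      MHandles[Path] = MLoad(Path);
  }

  // Return the stored handle; never triggers a new load.
  void *get(const std::string &Path) const {
    auto It = MHandles.find(Path);
    return It == MHandles.end() ? nullptr : It->second;
  }

private:
  LoadFn MLoad;
  std::map<std::string, void *> MHandles;
};
```

As the comment notes, holding a plugin this way does not extend to DLLs that the plugin itself loads manually, which is why the default-context cache is still leaked.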
@@ -212,6 +216,18 @@ void GlobalHandler::drainThreadPool() {
MHostTaskThreadPool.Inst->drain();
}

#ifdef _WIN32
// because of something not-yet-understood on Windows
// threads may be shutdown once the end of main() is reached
// making an orderly shutdown difficult. Fortunately, Windows
// itself is very aggressive about reclaiming memory. Thus,
// we focus solely on unloading the plugins, so as to not
// accidentally retain device handles. etc
void shutdown(){
GlobalHandler *&Handler = GlobalHandler::getInstancePtr();
Handler->unloadPlugins();

Review comment (Contributor): Could you please add a comment saying that it might not be safe for piTearDown to call low-level APIs, since dependent libraries might already be unloaded?

Reply (Contributor Author): I added a comment to each definition of piTearDown in the plugin libs.

}
#else
void shutdown() {
const LockGuard Lock{GlobalHandler::MSyclGlobalHandlerProtector};
GlobalHandler *&Handler = GlobalHandler::getInstancePtr();
@@ -246,18 +262,36 @@ void shutdown() {
delete Handler;
Handler = nullptr;
}
#endif

#ifdef _WIN32
extern "C" __SYCL_EXPORT BOOL WINAPI DllMain(HINSTANCE hinstDLL,
DWORD fdwReason,
LPVOID lpReserved) {
bool PrintPiTrace = false;
static const char *PiTrace = std::getenv("SYCL_PI_TRACE");
static const int PiTraceValue = PiTrace ? std::stoi(PiTrace) : 0;
if (PiTraceValue == -1 || PiTraceValue == 2) { // Means print all PI traces
PrintPiTrace = true;
}

// Perform actions based on the reason for calling.
switch (fdwReason) {
case DLL_PROCESS_DETACH:
- if (!lpReserved)
- shutdown();
if (PrintPiTrace)
std::cout << "---> DLL_PROCESS_DETACH syclx.dll\n" << std::endl;

#ifdef XPTI_ENABLE_INSTRUMENTATION
if (xptiTraceEnabled())
return TRUE; // When doing xpti tracing, we can't safely call shutdown.
// TODO: figure out what XPTI is doing that prevents release.
#endif

shutdown();
break;
case DLL_PROCESS_ATTACH:
if (PrintPiTrace)
std::cout << "---> DLL_PROCESS_ATTACH syclx.dll\n" << std::endl;
case DLL_THREAD_ATTACH:
case DLL_THREAD_DETACH:
break;
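For `DLL_PROCESS_DETACH`, Windows documents `lpReserved` as NULL when the DLL is being unloaded through `FreeLibrary` and non-NULL when the process is terminating. As this hunk reads, the earlier `if (!lpReserved) shutdown();` check is replaced by a shutdown on every process detach, skipped only while XPTI tracing is active. The sketch below models both decisions as pure functions so it compiles on any platform; the function names are hypothetical and the enum only repeats the reason values from the Windows headers.

```cpp
#include <cstdint>

// Reason codes with the same values as in the Windows headers; repeated here
// so the sketch does not need <windows.h>.
enum Reason : std::uint32_t {
  ProcessDetach = 0, // DLL_PROCESS_DETACH
  ProcessAttach = 1, // DLL_PROCESS_ATTACH
  ThreadAttach = 2,  // DLL_THREAD_ATTACH
  ThreadDetach = 3   // DLL_THREAD_DETACH
};

// Hypothetical model of the updated logic: shut down on every process
// detach unless XPTI tracing is active.
bool shouldShutdown(std::uint32_t Why, bool XptiTraceEnabled) {
  if (Why != ProcessDetach)
    return false;
  return !XptiTraceEnabled;
}

// Hypothetical model of the previous logic: shut down only when the DLL is
// unloaded explicitly (Reserved == nullptr), not at process termination.
bool shouldShutdownOld(std::uint32_t Why, const void *Reserved) {
  return Why == ProcessDetach && Reserved == nullptr;
}
```

The practical difference is the process-termination case: the old model skips shutdown there, the new one runs it so that plugins are unloaded and device handles are not retained.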
2 changes: 1 addition & 1 deletion sycl/source/detail/online_compiler/online_compiler.cpp
@@ -94,7 +94,7 @@ compileToSPIRV(const std::string &Source, sycl::info::device_type DeviceType,
#else
static const std::string OclocLibraryName = "libocloc.so";
#endif
- void *OclocLibrary = sycl::detail::pi::loadOsLibrary(OclocLibraryName);
+ void *OclocLibrary = sycl::detail::pi::loadOsPluginLibrary(OclocLibraryName);
if (!OclocLibrary)
throw online_compile_error("Cannot load ocloc library: " +
OclocLibraryName);
2 changes: 2 additions & 0 deletions sycl/source/detail/os_util.cpp
@@ -194,6 +194,8 @@ OSModuleHandle OSUtil::getOSModuleHandle(const void *VirtAddr) {
}

/// Returns an absolute path where the object was found.
// win_proxy_loader.dll uses this same logic. If it is changed
// significantly, it might be wise to change it there too.
std::string OSUtil::getCurrentDSODir() {
char Path[MAX_PATH];
Path[0] = '\0';
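The body of `getCurrentDSODir` is cut off by the diff, so its exact steps are not visible here. As an illustration of the "directory that contains the module" idea only, the sketch below strips the file name from a full module path. `parentDir` is a hypothetical helper; the real code obtains the path through the Windows API and may use a different routine to remove the file name.

```cpp
#include <string>

// Hypothetical helper: return the directory part of a module path, accepting
// both '\\' and '/' as separators. Returns an empty string when the path
// contains no separator.
std::string parentDir(const std::string &ModulePath) {
  const std::string::size_type Pos = ModulePath.find_last_of("\\/");
  return Pos == std::string::npos ? std::string() : ModulePath.substr(0, Pos);
}
```

Because win_proxy_loader.dll is said to use the same logic, both copies need to agree on this result for plugin paths to resolve to the same directory.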
4 changes: 2 additions & 2 deletions sycl/source/detail/pi.cpp
@@ -364,12 +364,12 @@ std::vector<std::pair<std::string, backend>> findPlugins() {
// Load the Plugin by calling the OS dependent library loading call.
// Return the handle to the Library.
void *loadPlugin(const std::string &PluginPath) {
- return loadOsLibrary(PluginPath);
+ return loadOsPluginLibrary(PluginPath);
}

// Unload the given plugin by calling the OS-specific library unloading call.
// \param Library OS-specific library handle created when loading.
- int unloadPlugin(void *Library) { return unloadOsLibrary(Library); }
+ int unloadPlugin(void *Library) { return unloadOsPluginLibrary(Library); }

// Binds all the PI Interface APIs to Plugin Library Function Addresses.
// TODO: Remove the 'OclPtr' extension to PI_API.
4 changes: 2 additions & 2 deletions sycl/source/detail/posix_pi.cpp
@@ -16,7 +16,7 @@ namespace sycl {
__SYCL_INLINE_VER_NAMESPACE(_V1) {
namespace detail::pi {

- void *loadOsLibrary(const std::string &PluginPath) {
+ void *loadOsPluginLibrary(const std::string &PluginPath) {
// TODO: Check if the option RTLD_NOW is correct. Explore using
// RTLD_DEEPBIND option when there are multiple plugins.
void *so = dlopen(PluginPath.c_str(), RTLD_NOW);
@@ -28,7 +28,7 @@ void *loadOsLibrary(const std::string &PluginPath) {
return so;
}

- int unloadOsLibrary(void *Library) {
+ int unloadOsPluginLibrary(void *Library) {
// The mock plugin does not have an associated library, so we allow nullptr
// here to avoid it trying to free a non-existent library.
if (!Library)