
Commit f58c568

[SYCL][L0] Implement Plugin Lifetime Management (#2942)
The patch introduces a new PI API, piTearDown, which the SYCL RT calls before unloading a plugin library. Global variables are now released in the Level Zero plugin.
Signed-off-by: Byoungro So <[email protected]>
1 parent 08a1c00 commit f58c568

17 files changed, +118 -24 lines changed


sycl/doc/GlobalObjectsInRuntime.md

Lines changed: 8 additions & 1 deletion
@@ -88,7 +88,14 @@ constructor and destructor.
 
 ## Plugins
 
-TBD
+Plugin lifetime is managed through piPluginInit() and piTearDown().
+GlobalHandler::shutdown() tears down all registered globals before the SYCL RT
+library is unloaded. It invokes piTearDown() and unload() for each
+plugin. piTearDown() performs any necessary tear-down at the
+plugin PI level. These two APIs allow on-demand plugin lifetime management: the SYCL
+RT controls the beginning and the end of each plugin's lifetime.
+
+![](images/plugin-lifetime.jpg)
 
 ## Low-level runtimes
 
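The ordering described in the new "Plugins" paragraph (tear down first, unload second, one plugin at a time) can be illustrated with a small mock. This is only a sketch: `MockPlugin`, `shutdownPlugins` and `runShutdownDemo` are hypothetical names invented for the illustration, and the mock records events instead of calling into a real plugin or the OS loader.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a loaded plugin; it only records what happens.
struct MockPlugin {
  std::string Name;
  std::vector<std::string> *Log; // shared event log, owned by the caller

  // Stands in for the plugin's piTearDown entry point.
  void tearDown() { Log->push_back(Name + ":piTearDown"); }
  // Stands in for plugin::unload(), which would call dlclose/FreeLibrary.
  void unload() { Log->push_back(Name + ":unload"); }
};

// Tear down, then unload, one plugin at a time.
inline void shutdownPlugins(std::vector<MockPlugin> &Plugins) {
  for (MockPlugin &P : Plugins) {
    assert(P.Log != nullptr && "every mock plugin needs an event log");
    P.tearDown(); // the plugin releases its own globals while still loaded
    P.unload();   // only afterwards is the library removed from memory
  }
}

// Small driver that returns the recorded sequence of events.
inline std::vector<std::string> runShutdownDemo() {
  std::vector<std::string> Log;
  std::vector<MockPlugin> Plugins = {{"opencl", &Log}, {"level_zero", &Log}};
  shutdownPlugins(Plugins);
  return Log;
}
```

The point of the ordering is that the tear-down code lives inside the plugin library, so it has to run while that library is still mapped.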

sycl/doc/PluginInterface.md

Lines changed: 16 additions & 3 deletions
@@ -121,9 +121,22 @@ The trace shows the PI API calls made when using SYCL_PI_TRACE=-1.
 bound.)
 
 ### Plugin Unloading
-The plugins not chosen to be connected to should be unloaded.
-
-TBD - Unloading a bound plugin.
+The plugins not chosen to be connected to should be unloaded. piInitializePlugins()
+can be called to load and bind the necessary plugins. In addition, piTearDown()
+can be called when plugins are no longer needed. It notifies each
+plugin to start performing its own tear-down process, such as global memory
+deallocation. In the future, piTearDown() can include any other jobs that need to
+be done before the plugin is unloaded from memory. Possibly, a
+notification of the plugin unloading to lower-level plugins can be added so that
+they can clean up their own memory [TBD].
+After piTearDown() is called, the plugin can be safely unloaded by calling unload(),
+which invokes OS-specific system calls to remove the dynamic library
+from memory.
+
+A plugin should not create global variables that require a non-trivial
+destructor. A pointer to heap-allocated memory is a good example of a variable
+that can be created at global scope; a std::vector object is not. piTearDown()
+takes care of deallocating these global variables safely.
 
 ## PI API Specification
 
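The guideline on global variables can be made concrete with a small sketch. All names here (`Platform`, `PlatformCache`, `addPlatform`, `tearDownGlobals`) are hypothetical and exist only for this illustration; `tearDownGlobals` plays the role that piTearDown() plays in a real plugin. Resetting the pointer to null after deletion is an extra precaution chosen for this sketch, not something the patch itself does.

```cpp
#include <cassert>
#include <vector>

// Trivially destructible global: nothing runs for it at program exit.
static int LivePlatforms = 0;

// Hypothetical platform record; counts live instances so leaks are visible.
struct Platform {
  Platform() { ++LivePlatforms; }
  ~Platform() { --LivePlatforms; }
};

// Discouraged at global scope: `static std::vector<Platform *> Cache;`
// Its destructor would run during static destruction, at a time the plugin
// cannot control. A raw pointer has no destructor, so nothing runs implicitly.
static std::vector<Platform *> *PlatformCache = nullptr;

// Lazily creates the cache and appends one platform.
static void addPlatform() {
  if (!PlatformCache)
    PlatformCache = new std::vector<Platform *>;
  PlatformCache->push_back(new Platform);
}

// Stand-in for piTearDown(): the single place where the globals are released.
static void tearDownGlobals() {
  if (!PlatformCache)
    return; // nothing was allocated, or tear-down already ran
  for (Platform *P : *PlatformCache)
    delete P;
  delete PlatformCache;
  PlatformCache = nullptr; // a later init can start from a clean state
  assert(LivePlatforms == 0 && "all cached platforms should be gone");
}
```

With this shape, the moment of deallocation is decided by whoever calls the tear-down function rather than by the C++ runtime's static destruction order.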

sycl/doc/images/plugin-lifetime.jpg

33.1 KB

sycl/include/CL/sycl/detail/pi.def

Lines changed: 1 addition & 0 deletions
@@ -126,5 +126,6 @@ _PI_API(piextUSMGetMemAllocInfo)
 
 _PI_API(piextKernelSetArgMemObj)
 _PI_API(piextKernelSetArgSampler)
+_PI_API(piTearDown)
 
 #undef _PI_API
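A list of `_PI_API(...)` entries followed by `#undef _PI_API` is the usual shape of an X-macro file: the includer defines the macro, includes the list, and gets one expansion per entry. The sketch below shows the general technique only. It is not taken from the SYCL sources: the `Demo*` names are hypothetical, and a list macro replaces the separate `.def` file so that the example stays self-contained.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the .def file: one entry per API.
#define DEMO_PI_API_LIST(X)                                                    \
  X(piextKernelSetArgMemObj)                                                   \
  X(piextKernelSetArgSampler)                                                  \
  X(piTearDown)

// Expansion 1: one enumerator per API, plus a trailing count.
enum class DemoApiKind {
#define DEMO_ENUM_ENTRY(api) api,
  DEMO_PI_API_LIST(DEMO_ENUM_ENTRY)
#undef DEMO_ENUM_ENTRY
  Count
};

// Expansion 2: one printable name per API, in the same order as the enum.
static const char *const DemoApiNames[] = {
#define DEMO_NAME_ENTRY(api) #api,
    DEMO_PI_API_LIST(DEMO_NAME_ENTRY)
#undef DEMO_NAME_ENTRY
};

// Maps an enumerator to its name.
static const char *demoApiName(DemoApiKind Kind) {
  assert(Kind != DemoApiKind::Count && "Count is not a real API");
  return DemoApiNames[static_cast<int>(Kind)];
}

// Maps a name back to its index, or returns -1 when the name is unknown.
static int demoFindApi(const char *Name) {
  for (int I = 0; I < static_cast<int>(DemoApiKind::Count); ++I)
    if (std::strcmp(DemoApiNames[I], Name) == 0)
      return I;
  return -1;
}
```

Adding one line to the list updates every expansion at once, which is why a new entry point such as piTearDown needs only a single added line in the list file.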

sycl/include/CL/sycl/detail/pi.h

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1600,6 +1600,11 @@ __SYCL_EXPORT pi_result piextUSMGetMemAllocInfo(
16001600
pi_context context, const void *ptr, pi_mem_info param_name,
16011601
size_t param_value_size, void *param_value, size_t *param_value_size_ret);
16021602

1603+
/// API to notify that the plugin should clean up its resources.
1604+
/// No PI calls should be made until the next piPluginInit call.
1605+
/// \param PluginParameter placeholder for future use, currenly not used.
1606+
__SYCL_EXPORT pi_result piTearDown(void *PluginParameter);
1607+
16031608
struct _pi_plugin {
16041609
// PI version supported by host passed to the plugin. The Plugin
16051610
// checks and writes the appropriate Function Pointers in

sycl/include/CL/sycl/detail/pi.hpp

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -123,6 +123,13 @@ __SYCL_EXPORT void contextSetExtendedDeleter(const cl::sycl::context &constext,
123123
// Implementation is OS dependent.
124124
void *loadOsLibrary(const std::string &Library);
125125

126+
// Function to unload the shared library
127+
// Implementation is OS dependent (see posix-pi.cpp and windows-pi.cpp)
128+
int unloadOsLibrary(void *Library);
129+
130+
// OS agnostic function to unload the shared library
131+
int unloadPlugin(void *Library);
132+
126133
// Function to get Address of a symbol defined in the shared
127134
// library, implementation is OS dependent.
128135
void *getOsLibraryFuncAddress(void *Library, const std::string &FunctionName);

sycl/plugins/cuda/pi_cuda.cpp

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4475,6 +4475,11 @@ pi_result cuda_piextUSMGetMemAllocInfo(pi_context context, const void *ptr,
44754475
return result;
44764476
}
44774477

4478+
// This API is called by Sycl RT to notify the end of the plugin lifetime.
4479+
// TODO: add a global variable lifetime management code here (see
4480+
// pi_level_zero.cpp for reference) Currently this is just a NOOP.
4481+
pi_result cuda_piTearDown(void *PluginParameter) { return PI_SUCCESS; }
4482+
44784483
const char SupportedVersion[] = _PI_H_VERSION_STRING;
44794484

44804485
pi_result piPluginInit(pi_plugin *PluginInit) {
@@ -4610,6 +4615,7 @@ pi_result piPluginInit(pi_plugin *PluginInit) {
46104615

46114616
_PI_CL(piextKernelSetArgMemObj, cuda_piextKernelSetArgMemObj)
46124617
_PI_CL(piextKernelSetArgSampler, cuda_piextKernelSetArgSampler)
4618+
_PI_CL(piTearDown, cuda_piTearDown)
46134619

46144620
#undef _PI_CL
46154621

sycl/plugins/level_zero/pi_level_zero.cpp

Lines changed: 28 additions & 10 deletions
@@ -12,6 +12,7 @@
 /// \ingroup sycl_pi_level_zero
 
 #include "pi_level_zero.hpp"
+#include <CL/sycl/detail/spinlock.hpp>
 #include <algorithm>
 #include <cstdarg>
 #include <cstdio>
@@ -172,6 +173,17 @@ class ReturnHelper {
 
 } // anonymous namespace
 
+// Global variables used in PI_Level_Zero
+// Note that we only create simple pointer variables so that the C++ RT won't
+// deallocate them automatically at the end of the main program.
+// The heap memory allocated for these global variables is reclaimed only when
+// Sycl RT calls piTearDown().
+static std::vector<pi_platform> *PiPlatformsCache =
+    new std::vector<pi_platform>;
+static sycl::detail::SpinLock *PiPlatformsCacheMutex =
+    new sycl::detail::SpinLock;
+static bool PiPlatformCachePopulated = false;
+
 // TODO:: In the following 4 methods we may want to distinguish read access vs.
 // write (as it is OK for multiple threads to read the map without locking it).
 
@@ -821,16 +833,8 @@ pi_result piPlatformsGet(pi_uint32 NumEntries, pi_platform *Platforms,
   // 1. sycl::platform equality issue; we always return the same pi_platform.
   // 2. performance; we can save time by immediately return from cache.
   //
-  // Note: The memory for "PiPlatformsCache" and "PiPlatformsCacheMutex" is
-  // intentionally leaked because the application may call into the SYCL
-  // runtime from a global destructor, and such a call could eventually
-  // access these variables. Therefore, there is no safe time when
-  // "PiPlatformsCache" and "PiPlatformsCacheMutex" could be deleted.
-  static auto PiPlatformsCache = new std::vector<pi_platform>;
-  static auto PiPlatformsCacheMutex = new std::mutex;
-  static bool PiPlatformCachePopulated = false;
-
-  std::lock_guard<std::mutex> Lock(*PiPlatformsCacheMutex);
+
+  const std::lock_guard<sycl::detail::SpinLock> Lock{*PiPlatformsCacheMutex};
   if (!PiPlatformCachePopulated) {
     const char *CommandListCacheSize =
         std::getenv("SYCL_PI_LEVEL_ZERO_MAX_COMMAND_LIST_CACHE");
@@ -5349,4 +5353,18 @@ pi_result piPluginInit(pi_plugin *PluginInit) {
   return PI_SUCCESS;
 }
 
+// SYCL RT calls this API to notify the end of plugin lifetime.
+// It can include all the jobs to tear down resources before
+// the plugin is unloaded from memory.
+pi_result piTearDown(void *PluginParameter) {
+  // reclaim pi_platform objects here since we don't have piPlatformRelease.
+  for (pi_platform &Platform : *PiPlatformsCache) {
+    delete Platform;
+  }
+  delete PiPlatformsCache;
+  delete PiPlatformsCacheMutex;
+
+  return PI_SUCCESS;
+}
+
 } // extern "C"
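The Level Zero changes guard a lazily populated cache with `std::lock_guard` over a spin lock instead of `std::mutex`. `std::lock_guard` accepts any type that provides `lock()` and `unlock()`, which the sketch below demonstrates. `DemoSpinLock` is a hypothetical minimal lock written for this illustration; it is not the implementation of `sycl::detail::SpinLock`, and the cache contents are placeholders.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Minimal spin lock; lock()/unlock() make it usable with std::lock_guard.
class DemoSpinLock {
public:
  void lock() {
    while (Flag.test_and_set(std::memory_order_acquire))
      std::this_thread::yield(); // back off while another thread holds it
  }
  void unlock() { Flag.clear(std::memory_order_release); }

private:
  std::atomic_flag Flag = ATOMIC_FLAG_INIT;
};

// Heap-allocated globals, following the pointer-only pattern in the diff.
static std::vector<int> *DemoCache = new std::vector<int>;
static DemoSpinLock *DemoCacheLock = new DemoSpinLock;
static bool DemoCachePopulated = false;
static int DemoPopulateCount = 0; // how often the slow path actually ran

// Returns the cached size, populating the cache on the first call only.
static std::size_t demoGetPlatformCount() {
  assert(DemoCache && DemoCacheLock && "globals must not be torn down yet");
  const std::lock_guard<DemoSpinLock> Guard{*DemoCacheLock};
  if (!DemoCachePopulated) {
    DemoCache->assign({1, 2, 3}); // placeholder for real platform discovery
    DemoCachePopulated = true;
    ++DemoPopulateCount;
  }
  return DemoCache->size();
}
```

Because the populated flag is only read and written while the guard is held, a second caller sees either an empty, unpopulated cache or a fully populated one.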

sycl/plugins/opencl/pi_opencl.cpp

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1170,6 +1170,11 @@ pi_result piextProgramGetNativeHandle(pi_program program,
11701170
return piextGetNativeHandle(program, nativeHandle);
11711171
}
11721172

1173+
// This API is called by Sycl RT to notify the end of the plugin lifetime.
1174+
// TODO: add a global variable lifetime management code here (see
1175+
// pi_level_zero.cpp for reference) Currently this is just a NOOP.
1176+
pi_result piTearDown(void *PluginParameter) { return PI_SUCCESS; }
1177+
11731178
pi_result piPluginInit(pi_plugin *PluginInit) {
11741179
int CompareVersions = strcmp(PluginInit->PiVersion, SupportedVersion);
11751180
if (CompareVersions < 0) {
@@ -1297,6 +1302,7 @@ pi_result piPluginInit(pi_plugin *PluginInit) {
12971302

12981303
_PI_CL(piextKernelSetArgMemObj, piextKernelSetArgMemObj)
12991304
_PI_CL(piextKernelSetArgSampler, piextKernelSetArgSampler)
1305+
_PI_CL(piTearDown, piTearDown)
13001306

13011307
#undef _PI_CL
13021308

sycl/source/detail/global_handler.cpp

Lines changed: 13 additions & 1 deletion
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include <CL/sycl/detail/device_filter.hpp>
+#include <CL/sycl/detail/pi.hpp>
 #include <CL/sycl/detail/spinlock.hpp>
 #include <detail/global_handler.hpp>
 #include <detail/platform_impl.hpp>
@@ -113,7 +114,18 @@ GlobalHandler::getDeviceFilterList(const std::string &InitValue) {
   return *MDeviceFilterList;
 }
 
-void shutdown() { delete &GlobalHandler::instance(); }
+void shutdown() {
+  for (plugin &Plugin : GlobalHandler::instance().getPlugins()) {
+    // PluginParameter is reserved for future use that can control
+    // some parameters in the plugin tear-down process.
+    // Currently, it is not used.
+    void *PluginParameter = nullptr;
+    Plugin.call_nocheck<PiApiKind::piTearDown>(PluginParameter);
+    Plugin.unload();
+  }
+
+  delete &GlobalHandler::instance();
+}
 
 #ifdef _WIN32
 BOOL WINAPI DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpReserved) {

sycl/source/detail/pi.cpp

Lines changed: 11 additions & 5 deletions
@@ -256,6 +256,10 @@ void *loadPlugin(const std::string &PluginPath) {
   return loadOsLibrary(PluginPath);
 }
 
+// Unload the given plugin by calling the OS-specific library unloading call.
+// \param Library OS-specific library handle created when loading.
+int unloadPlugin(void *Library) { return unloadOsLibrary(Library); }
+
 // Binds all the PI Interface APIs to Plugin Library Function Addresses.
 // TODO: Remove the 'OclPtr' extension to PI_API.
 // TODO: Change the functionality such that a single getOsLibraryFuncAddress
@@ -339,18 +343,20 @@ static void initializePlugins(vector_class<plugin> *Plugins) {
         PluginNames[I].first.find("opencl") != std::string::npos) {
       // Use the OpenCL plugin as the GlobalPlugin
       GlobalPlugin =
-          std::make_shared<plugin>(PluginInformation, backend::opencl);
+          std::make_shared<plugin>(PluginInformation, backend::opencl, Library);
     } else if (InteropBE == backend::cuda &&
                PluginNames[I].first.find("cuda") != std::string::npos) {
       // Use the CUDA plugin as the GlobalPlugin
-      GlobalPlugin = std::make_shared<plugin>(PluginInformation, backend::cuda);
+      GlobalPlugin =
+          std::make_shared<plugin>(PluginInformation, backend::cuda, Library);
     } else if (InteropBE == backend::level_zero &&
                PluginNames[I].first.find("level_zero") != std::string::npos) {
       // Use the LEVEL_ZERO plugin as the GlobalPlugin
-      GlobalPlugin =
-          std::make_shared<plugin>(PluginInformation, backend::level_zero);
+      GlobalPlugin = std::make_shared<plugin>(PluginInformation,
+                                              backend::level_zero, Library);
     }
-    Plugins->emplace_back(plugin(PluginInformation, PluginNames[I].second));
+    Plugins->emplace_back(
+        plugin(PluginInformation, PluginNames[I].second, Library));
     if (trace(TraceLevel::PI_TRACE_BASIC))
       std::cerr << "SYCL_PI_TRACE[basic]: "
                 << "Plugin found and successfully loaded: "

sycl/source/detail/plugin.hpp

Lines changed: 6 additions & 2 deletions
@@ -34,8 +34,8 @@ class plugin {
 public:
   plugin() = delete;
 
-  plugin(RT::PiPlugin Plugin, backend UseBackend)
-      : MPlugin(Plugin), MBackend(UseBackend),
+  plugin(RT::PiPlugin Plugin, backend UseBackend, void *LibraryHandle)
+      : MPlugin(Plugin), MBackend(UseBackend), MLibraryHandle(LibraryHandle),
         TracingMutex(std::make_shared<std::mutex>()) {}
 
   plugin &operator=(const plugin &) = default;
@@ -107,10 +107,14 @@ class plugin {
   }
 
   backend getBackend(void) const { return MBackend; }
+  void *getLibraryHandle() const { return MLibraryHandle; }
+  void *getLibraryHandle() { return MLibraryHandle; }
+  int unload() { return RT::unloadPlugin(MLibraryHandle); }
 
 private:
   RT::PiPlugin MPlugin;
   backend MBackend;
+  void *MLibraryHandle; // the handle returned from dlopen
   std::shared_ptr<std::mutex> TracingMutex;
 }; // class plugin
 } // namespace detail

sycl/source/detail/posix_pi.cpp

Lines changed: 2 additions & 0 deletions
@@ -22,6 +22,8 @@ void *loadOsLibrary(const std::string &PluginPath) {
   return dlopen(PluginPath.c_str(), RTLD_NOW);
 }
 
+int unloadOsLibrary(void *Library) { return dlclose(Library); }
+
 void *getOsLibraryFuncAddress(void *Library, const std::string &FunctionName) {
   return dlsym(Library, FunctionName.c_str());
 }

sycl/source/detail/windows_pi.cpp

Lines changed: 5 additions & 1 deletion
@@ -8,9 +8,9 @@
 
 #include <CL/sycl/detail/defines.hpp>
 
+#include <string>
 #include <windows.h>
 #include <winreg.h>
-#include <string>
 
 __SYCL_INLINE_NAMESPACE(cl) {
 namespace sycl {
@@ -21,6 +21,10 @@ void *loadOsLibrary(const std::string &PluginPath) {
   return (void *)LoadLibraryA(PluginPath.c_str());
 }
 
+int unloadOsLibrary(void *Library) {
+  return (int)FreeLibrary((HMODULE)Library);
+}
+
 void *getOsLibraryFuncAddress(void *Library, const std::string &FunctionName) {
   return reinterpret_cast<void *>(
       GetProcAddress((HMODULE)Library, FunctionName.c_str()));
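Both `unloadOsLibrary` implementations return the raw OS result as an `int`, but the two conventions are opposite: `dlclose` returns 0 on success, while `FreeLibrary` returns a nonzero value on success. A caller that wants to check the result therefore needs to know which convention applies. The helper below is a hypothetical sketch of one way to normalize the value; `UnloadConvention`, `unloadSucceeded` and `unloadSucceededOnThisOs` are names made up for this example and do not appear in the patch.

```cpp
#include <cassert>

// Which OS convention a raw unload result follows (hypothetical helper type).
enum class UnloadConvention {
  Posix,  // dlclose(): 0 means success
  Windows // FreeLibrary(): nonzero means success
};

// Converts the raw int from an unload call into a uniform success flag.
static bool unloadSucceeded(UnloadConvention Convention, int RawResult) {
  switch (Convention) {
  case UnloadConvention::Posix:
    return RawResult == 0;
  case UnloadConvention::Windows:
    return RawResult != 0;
  }
  assert(false && "unknown unload convention");
  return false;
}

// Applies the convention of the platform this code is compiled for.
static bool unloadSucceededOnThisOs(int RawResult) {
#ifdef _WIN32
  return unloadSucceeded(UnloadConvention::Windows, RawResult);
#else
  return unloadSucceeded(UnloadConvention::Posix, RawResult);
#endif
}
```

In the patch as shown, `shutdown()` ignores the value returned by `Plugin.unload()`, so the difference has no visible effect there; it matters only to callers that inspect the result.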

sycl/test/abi/pi_level_zero_symbol_check.dump

Lines changed: 1 addition & 0 deletions
@@ -76,6 +76,7 @@ piSamplerCreate
 piSamplerGetInfo
 piSamplerRelease
 piSamplerRetain
+piTearDown
 piclProgramCreateWithSource
 piextContextCreateWithNativeHandle
 piextContextGetNativeHandle

sycl/test/abi/pi_opencl_symbol_check.dump

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ piProgramCreateWithBinary
 piProgramLink
 piQueueCreate
 piSamplerCreate
+piTearDown
 piclProgramCreateWithSource
 piextContextCreateWithNativeHandle
 piextContextGetNativeHandle

sycl/unittests/helpers/PiMock.hpp

Lines changed: 2 additions & 1 deletion
@@ -121,7 +121,8 @@ class PiMock {
     // Copy the PiPlugin, thus untying our to-be mock platform from other
     // platforms within the context. Reset our platform to use the new plugin.
     auto NewPluginPtr = std::make_shared<detail::plugin>(
-        OriginalPiPlugin.getPiPlugin(), OriginalPiPlugin.getBackend());
+        OriginalPiPlugin.getPiPlugin(), OriginalPiPlugin.getBackend(),
+        OriginalPiPlugin.getLibraryHandle());
     ImplPtr->setPlugin(NewPluginPtr);
     // Extract the new PiPlugin instance by a non-const pointer,
     // explicitly allowing modification
