[SYCL] Wrap complex global objects to control lifetime #2516
Merged
romanovvlad
merged 26 commits into
intel:sycl
from
alexbatashev:global_constructors_wrapper
Oct 13, 2020
Commits
f4b6b31
Create simple wrapper for Scheduler and ProgramManager
8cc33b6
Handle more objects
74298f0
Add some documentation
9e5b0a3
clang-format
1dff725
Save progress
2a0d2db
Use pointers to narrow lifetime
8e309ed
Fix some comments
f958104
clang-format
d01d49f
Minor improvements
48b6ea3
Address a few comments
026ea4e
Merge remote-tracking branch 'upstream/sycl' into global_constructors…
7db4dc1
More fixes
ba84ef1
Use atomic_flag
8626817
Apply clang-format
3efb877
Correctly use atomic_flag
6aec7b8
Improve var naming
b5d21f5
Bugfix
3e0947e
Disable complex global object test on l0 and cuda
6600e39
Use magic static
f9a0263
Eliminate global object in config.cpp
e4b7866
Merge remote-tracking branch 'upstream/sycl' into global_constructors…
e55e576
w/a problems in OpenCL runtimes
409db9a
Fix includes to break cyclic dependencies
4eb4398
clang-format
a0eeeff
Merge branch 'sycl' into global_constructors_wrapper
55dd6ce
Fix tests
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
//==-------------------- spinlock.hpp --- Spin lock ------------------------==// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#pragma once | ||
|
||
#include <CL/sycl/detail/defines.hpp> | ||
|
||
#include <atomic> | ||
#include <thread> | ||
|
||
__SYCL_INLINE_NAMESPACE(cl) { | ||
namespace sycl { | ||
namespace detail { | ||
/// SpinLock is a synchronization primitive that uses an atomic flag and makes | ||
/// a thread trying to acquire the lock wait in a loop, repeatedly checking | ||
/// whether the lock is available. | ||
/// | ||
/// One important feature of this implementation is that std::atomic_flag can | ||
/// be statically initialized with ATOMIC_FLAG_INIT. This lets SpinLock be | ||
/// initialized without dynamic initialization code and have a trivial | ||
/// destructor, which makes it possible to use it in a global context (unlike | ||
/// std::mutex, which doesn't provide such guarantees). | ||
class SpinLock { | ||
public: | ||
void lock() { | ||
while (MLock.test_and_set(std::memory_order_acquire)) | ||
std::this_thread::yield(); | ||
} | ||
void unlock() { MLock.clear(std::memory_order_release); } | ||
|
||
private: | ||
std::atomic_flag MLock{ATOMIC_FLAG_INIT}; | ||
}; | ||
} // namespace detail | ||
} // namespace sycl | ||
} // __SYCL_INLINE_NAMESPACE(cl) |
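The property the SpinLock comment relies on can be checked at compile time. The following is a minimal sketch, not code from this PR: it repeats the class so the example compiles on its own, and `CounterLock`, `Counter`, and `countWithSpinLock` are hypothetical names introduced only for illustration. It shows that the lock is trivially destructible and that it works with `std::lock_guard`, because it provides `lock()` and `unlock()`.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <type_traits>
#include <vector>

// Stand-in copy of the SpinLock from the diff, so the sketch is self-contained.
class SpinLock {
public:
  void lock() {
    while (MLock.test_and_set(std::memory_order_acquire))
      std::this_thread::yield();
  }
  void unlock() { MLock.clear(std::memory_order_release); }

private:
  std::atomic_flag MLock = ATOMIC_FLAG_INIT;
};

// A trivial destructor means no exit-time code is emitted for a global lock.
static_assert(std::is_trivially_destructible<SpinLock>::value,
              "SpinLock should be safe to keep as a global object");

// Hypothetical globals, used only for this illustration.
static SpinLock CounterLock;
static int Counter = 0;

// Hypothetical helper: increments Counter from several threads under the lock
// and returns the final value.
int countWithSpinLock(int NumThreads, int IncrementsPerThread) {
  Counter = 0;
  std::vector<std::thread> Threads;
  for (int T = 0; T < NumThreads; ++T)
    Threads.emplace_back([IncrementsPerThread] {
      for (int I = 0; I < IncrementsPerThread; ++I) {
        // lock() on construction, unlock() on scope exit.
        const std::lock_guard<SpinLock> Guard{CounterLock};
        ++Counter;
      }
    });
  for (std::thread &T : Threads)
    T.join();
  return Counter;
}
```

Calling `countWithSpinLock(4, 1000)` should return 4000 if the lock serializes the increments as intended.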
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,139 @@ | ||
//==--------- global_handler.cpp --- Global objects handler ----------------==// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#include <CL/sycl/detail/device_filter.hpp> | ||
#include <CL/sycl/detail/spinlock.hpp> | ||
#include <detail/global_handler.hpp> | ||
#include <detail/platform_impl.hpp> | ||
#include <detail/plugin.hpp> | ||
#include <detail/program_manager/program_manager.hpp> | ||
#include <detail/scheduler/scheduler.hpp> | ||
|
||
#ifdef WIN32 | ||
#include <windows.h> | ||
#endif | ||
|
||
#include <vector> | ||
|
||
__SYCL_INLINE_NAMESPACE(cl) { | ||
namespace sycl { | ||
namespace detail { | ||
GlobalHandler::GlobalHandler() = default; | ||
GlobalHandler::~GlobalHandler() = default; | ||
|
||
GlobalHandler &GlobalHandler::instance() { | ||
static GlobalHandler *SyclGlobalObjectsHandler = new GlobalHandler(); | ||
return *SyclGlobalObjectsHandler; | ||
} | ||
|
||
Scheduler &GlobalHandler::getScheduler() { | ||
if (MScheduler) | ||
return *MScheduler; | ||
|
||
const std::lock_guard<SpinLock> Lock{MFieldsLock}; | ||
if (!MScheduler) | ||
MScheduler = std::make_unique<Scheduler>(); | ||
|
||
return *MScheduler; | ||
} | ||
ProgramManager &GlobalHandler::getProgramManager() { | ||
if (MProgramManager) | ||
return *MProgramManager; | ||
|
||
const std::lock_guard<SpinLock> Lock{MFieldsLock}; | ||
if (!MProgramManager) | ||
MProgramManager = std::make_unique<ProgramManager>(); | ||
|
||
return *MProgramManager; | ||
} | ||
Sync &GlobalHandler::getSync() { | ||
if (MSync) | ||
return *MSync; | ||
|
||
const std::lock_guard<SpinLock> Lock{MFieldsLock}; | ||
if (!MSync) | ||
MSync = std::make_unique<Sync>(); | ||
|
||
return *MSync; | ||
} | ||
std::vector<PlatformImplPtr> &GlobalHandler::getPlatformCache() { | ||
if (MPlatformCache) | ||
return *MPlatformCache; | ||
|
||
const std::lock_guard<SpinLock> Lock{MFieldsLock}; | ||
if (!MPlatformCache) | ||
MPlatformCache = std::make_unique<std::vector<PlatformImplPtr>>(); | ||
|
||
return *MPlatformCache; | ||
} | ||
std::mutex &GlobalHandler::getPlatformMapMutex() { | ||
if (MPlatformMapMutex) | ||
return *MPlatformMapMutex; | ||
|
||
const std::lock_guard<SpinLock> Lock{MFieldsLock}; | ||
if (!MPlatformMapMutex) | ||
MPlatformMapMutex = std::make_unique<std::mutex>(); | ||
|
||
return *MPlatformMapMutex; | ||
} | ||
std::mutex &GlobalHandler::getFilterMutex() { | ||
if (MFilterMutex) | ||
return *MFilterMutex; | ||
|
||
const std::lock_guard<SpinLock> Lock{MFieldsLock}; | ||
if (!MFilterMutex) | ||
MFilterMutex = std::make_unique<std::mutex>(); | ||
|
||
return *MFilterMutex; | ||
} | ||
std::vector<plugin> &GlobalHandler::getPlugins() { | ||
if (MPlugins) | ||
return *MPlugins; | ||
|
||
const std::lock_guard<SpinLock> Lock{MFieldsLock}; | ||
if (!MPlugins) | ||
MPlugins = std::make_unique<std::vector<plugin>>(); | ||
|
||
return *MPlugins; | ||
} | ||
device_filter_list & | ||
GlobalHandler::getDeviceFilterList(const std::string &InitValue) { | ||
if (MDeviceFilterList) | ||
return *MDeviceFilterList; | ||
|
||
const std::lock_guard<SpinLock> Lock{MFieldsLock}; | ||
if (!MDeviceFilterList) | ||
MDeviceFilterList = std::make_unique<device_filter_list>(InitValue); | ||
|
||
return *MDeviceFilterList; | ||
} | ||
|
||
void shutdown() { delete &GlobalHandler::instance(); } | ||
|
||
#ifdef WIN32 | ||
BOOL WINAPI DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpReserved) { | ||
// Perform actions based on the reason for calling. | ||
switch (fdwReason) { | ||
case DLL_PROCESS_DETACH: | ||
shutdown(); | ||
break; | ||
case DLL_PROCESS_ATTACH: | ||
case DLL_THREAD_ATTACH: | ||
case DLL_THREAD_DETACH: | ||
break; | ||
} | ||
return TRUE; // Successful DLL_PROCESS_ATTACH. | ||
} | ||
#else | ||
// Setting maximum priority on destructor ensures it runs after all other global | ||
// destructors. | ||
__attribute__((destructor(65535))) static void syclUnload() { shutdown(); } | ||
#endif | ||
} // namespace detail | ||
} // namespace sycl | ||
} // __SYCL_INLINE_NAMESPACE(cl) |
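Each getter above tests its `unique_ptr` before taking the lock. Under the C++ memory model, a plain read that can run concurrently with the write made inside the lock is formally a data race. The sketch below shows one possible variant of the same double-checked pattern in which the fast path is an atomic acquire load. It is an illustration under stated assumptions, not the PR's implementation: `Widget` and `LazyHolder` are hypothetical names, and `std::mutex` replaces `SpinLock` only to keep the example short.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical payload type standing in for Scheduler, ProgramManager, etc.
struct Widget {
  int Value = 42;
};

// Hypothetical holder: double-checked lazy initialization where the
// fast-path read is an atomic load instead of a plain pointer read.
class LazyHolder {
public:
  LazyHolder() = default;
  LazyHolder(const LazyHolder &) = delete;
  LazyHolder &operator=(const LazyHolder &) = delete;
  ~LazyHolder() { delete MWidget.load(std::memory_order_acquire); }

  Widget &get() {
    // Fast path: this acquire load pairs with the release store below, so a
    // thread that sees a non-null pointer also sees a fully built Widget.
    Widget *Ptr = MWidget.load(std::memory_order_acquire);
    if (Ptr)
      return *Ptr;

    // Slow path: re-check under the lock so only one thread constructs.
    const std::lock_guard<std::mutex> Lock{MLock};
    Ptr = MWidget.load(std::memory_order_relaxed);
    if (!Ptr) {
      Ptr = new Widget();
      MWidget.store(Ptr, std::memory_order_release);
    }
    return *Ptr;
  }

private:
  std::mutex MLock; // assumption: the PR guards its fields with SpinLock
  std::atomic<Widget *> MWidget{nullptr};
};

// Hypothetical demo: repeated calls return the same object.
void demoLazyHolder() {
  LazyHolder Holder;
  Widget &First = Holder.get();
  Widget &Second = Holder.get();
  assert(&First == &Second);
}
```

The trade-off is that ownership is managed by hand in the destructor rather than by `unique_ptr`.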
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
//==--------- global_handler.hpp --- Global objects handler ----------------==// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#pragma once | ||
|
||
#include <CL/sycl/detail/spinlock.hpp> | ||
#include <CL/sycl/detail/util.hpp> | ||
|
||
#include <memory> | ||
|
||
__SYCL_INLINE_NAMESPACE(cl) { | ||
namespace sycl { | ||
namespace detail { | ||
class platform_impl; | ||
class Scheduler; | ||
class ProgramManager; | ||
class Sync; | ||
class plugin; | ||
class device_filter_list; | ||
|
||
using PlatformImplPtr = std::shared_ptr<platform_impl>; | ||
|
||
/// Wrapper class for global data structures with non-trivial destructors. | ||
/// | ||
/// As user code can call SYCL Runtime functions from the destructors of global | ||
/// objects, it is not safe for the runtime library to have global objects with | ||
/// non-trivial destructors. Such destructors can be called any time after | ||
/// exiting main, which may result in user application crashes. Instead, | ||
/// complex global objects must be wrapped into GlobalHandler. Its instance | ||
/// is stored on the heap and deallocated when the runtime library is being | ||
/// unloaded. | ||
/// | ||
/// There's no need to store trivial globals here, as no code for their | ||
/// construction or destruction is generated anyway. | ||
class GlobalHandler { | ||
public: | ||
/// \return a reference to a GlobalHandler singleton instance. Memory for | ||
/// storing objects is allocated on first call. The reference is valid as long | ||
/// as the runtime library is loaded (i.e. until `DllMain` or | ||
/// `__attribute__((destructor))` is called). | ||
static GlobalHandler &instance(); | ||
|
||
GlobalHandler(const GlobalHandler &) = delete; | ||
GlobalHandler(GlobalHandler &&) = delete; | ||
|
||
Scheduler &getScheduler(); | ||
ProgramManager &getProgramManager(); | ||
Sync &getSync(); | ||
std::vector<PlatformImplPtr> &getPlatformCache(); | ||
std::mutex &getPlatformMapMutex(); | ||
std::mutex &getFilterMutex(); | ||
std::vector<plugin> &getPlugins(); | ||
device_filter_list &getDeviceFilterList(const std::string &InitValue); | ||
|
||
private: | ||
friend void shutdown(); | ||
// Constructor and destructor are declared out-of-line to allow incomplete | ||
// types as template arguments to unique_ptr. | ||
GlobalHandler(); | ||
~GlobalHandler(); | ||
|
||
SpinLock MFieldsLock; | ||
|
||
std::unique_ptr<Scheduler> MScheduler; | ||
std::unique_ptr<ProgramManager> MProgramManager; | ||
std::unique_ptr<Sync> MSync; | ||
std::unique_ptr<std::vector<PlatformImplPtr>> MPlatformCache; | ||
std::unique_ptr<std::mutex> MPlatformMapMutex; | ||
std::unique_ptr<std::mutex> MFilterMutex; | ||
std::unique_ptr<std::vector<plugin>> MPlugins; | ||
std::unique_ptr<device_filter_list> MDeviceFilterList; | ||
}; | ||
} // namespace detail | ||
} // namespace sycl | ||
} // __SYCL_INLINE_NAMESPACE(cl) |
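The header comment about declaring the constructor and destructor out-of-line can be illustrated in a single translation unit. This is a minimal sketch with hypothetical names (`Engine`, `Holder`). The point it shows is that `std::unique_ptr<T>` may be a member while `T` is incomplete, as long as the special members that need `T`'s destructor are defined where `T` is complete.

```cpp
#include <cassert>
#include <memory>

// Forward declaration only: Engine is incomplete here, mirroring how
// global_handler.hpp forward-declares Scheduler, ProgramManager, and others.
class Engine;

// Hypothetical holder class, for illustration only.
class Holder {
public:
  Holder();  // declared here, defined once Engine is complete
  ~Holder(); // unique_ptr's deleter needs the complete type
  int answer() const;

private:
  std::unique_ptr<Engine> MEngine;
};

// In a real project, everything below would live in the .cpp file.
class Engine {
public:
  int answer() const { return 42; }
};

Holder::Holder() : MEngine(new Engine()) {}
Holder::~Holder() = default; // Engine is complete here, so delete is valid
int Holder::answer() const { return MEngine->answer(); }

// Hypothetical demo of the holder in use.
void demoHolder() {
  Holder H;
  assert(H.answer() == 42);
}
```

If `~Holder()` were implicitly defined in the class body, the compiler would have to instantiate the deleter while `Engine` is still incomplete, which standard libraries typically reject with a static assertion.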
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,6 +17,7 @@ | |
#include <CL/sycl/detail/device_filter.hpp> | ||
#include <CL/sycl/detail/pi.hpp> | ||
#include <detail/config.hpp> | ||
#include <detail/global_handler.hpp> | ||
#include <detail/plugin.hpp> | ||
|
||
#include <bitset> | ||
|
@@ -283,18 +284,12 @@ bool trace(TraceLevel Level) { | |
// Initializes all available Plugins. | ||
const vector_class<plugin> &initialize() { | ||
static std::once_flag PluginsInitDone; | ||
- static vector_class<plugin> *Plugins = nullptr; | ||
|
||
std::call_once(PluginsInitDone, []() { | ||
- // The memory for "Plugins" is intentionally leaked because the application | ||
- // may call into the SYCL runtime from a global destructor, and such a call | ||
- // could eventually call down to initialize(). Therefore, there is no safe | ||
- // time when "Plugins" could be deleted. | ||
- Plugins = new vector_class<plugin>; | ||
- initializePlugins(Plugins); | ||
+ initializePlugins(&GlobalHandler::instance().getPlugins()); | ||
[Review comment] nit: why repeat calls here instead of remembering the return value? |
||
}); | ||
|
||
- return *Plugins; | ||
+ return GlobalHandler::instance().getPlugins(); | ||
} | ||
|
||
static void initializePlugins(vector_class<plugin> *Plugins) { | ||
|
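One way to read the review nit on this hunk is to look up the plugin storage once per call and reuse the reference for both initialization and the return value. The sketch below is an assumption about what the reviewer had in mind, not code from the PR. `getPluginStorage`, `fillPlugins`, and `initializeOnce` are hypothetical stand-ins for `GlobalHandler::instance().getPlugins()`, `initializePlugins()`, and `initialize()`, and the plugin names are placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for GlobalHandler::instance().getPlugins().
std::vector<std::string> &getPluginStorage() {
  static std::vector<std::string> *Storage = new std::vector<std::string>();
  return *Storage;
}

// Hypothetical stand-in for initializePlugins(); the names are placeholders.
void fillPlugins(std::vector<std::string> &Plugins) {
  Plugins.push_back("plugin_a");
  Plugins.push_back("plugin_b");
}

// Hypothetical stand-in for initialize().
const std::vector<std::string> &initializeOnce() {
  static std::once_flag InitDone;
  // Look the storage up once and reuse the reference below.
  std::vector<std::string> &Plugins = getPluginStorage();
  // The lambda runs at most once, no matter how many threads call in.
  std::call_once(InitDone, [&Plugins]() { fillPlugins(Plugins); });
  return Plugins;
}

// Hypothetical demo: a second call neither re-initializes nor relocates.
void demoInitializeOnce() {
  const std::vector<std::string> &First = initializeOnce();
  const std::size_t Size = First.size();
  const std::vector<std::string> &Second = initializeOnce();
  assert(&First == &Second);
  assert(Second.size() == Size);
}
```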
Can we avoid knowing and enumerating all the objects handled in this wrapper, and instead make it a custom heap where arbitrary objects are handled through overloaded new/delete?
@smaslov-intel I was thinking about it. If we go with some custom memory space, we'd need some mechanism to call destructors on that memory. We'd either need to store custom deleters and call them upon shutdown, or implement some other logic where the object is responsible for its own destruction. Both look like quite complicated solutions. Also, it's not clear how to access these objects. Use names and store pointers in a map? So, I decided to keep the global wrapper with unique pointers for now. Adding a new global object shouldn't be that hard, and I don't expect us to add a new global object every other Tuesday, as they're a code smell anyway.
But I feel it badly violates the OOP encapsulation principle.
Also, it makes it nearly impossible to use this global handler in the plugins (standalone parts of SYCL RT).
And then anything else we use for plugins would compete with the 65535 destruction priority used for this data.
Having said that, I admit this is definitely an improvement, and we should go with it until something better is developed.
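For reference, the "store custom deleters and call them upon shutdown" alternative mentioned in this thread could look roughly like the sketch below. It is a hypothetical illustration, not something proposed in the PR: `ObjectRegistry` and `Tracked` are invented names, and the sketch leaves open the question raised above of how callers would later find an object they registered.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical registry: owns arbitrary heap objects and destroys them in
// reverse creation order when shutdown() is called.
class ObjectRegistry {
public:
  template <typename T, typename... Args> T &create(Args &&...As) {
    // Hold the object in a unique_ptr until its deleter is registered, so it
    // is not leaked if push_back throws.
    std::unique_ptr<T> Obj(new T(std::forward<Args>(As)...));
    T *Raw = Obj.get();
    const std::lock_guard<std::mutex> Lock{MMutex};
    MDeleters.push_back([Raw]() { delete Raw; });
    Obj.release();
    return *Raw;
  }

  // Runs the stored deleters, newest first. Destructors must not call back
  // into the registry, because the mutex is held while they run.
  void shutdown() {
    const std::lock_guard<std::mutex> Lock{MMutex};
    while (!MDeleters.empty()) {
      MDeleters.back()();
      MDeleters.pop_back();
    }
  }

private:
  std::mutex MMutex;
  std::vector<std::function<void()>> MDeleters;
};

// Hypothetical payload that records its destruction in a log.
struct Tracked {
  Tracked(int Id, std::vector<int> &Log) : MId(Id), MLog(Log) {}
  ~Tracked() { MLog.push_back(MId); }
  int MId;
  std::vector<int> &MLog;
};

// Hypothetical demo: objects are destroyed in reverse creation order.
void demoRegistry() {
  std::vector<int> Log;
  ObjectRegistry Registry;
  Registry.create<Tracked>(1, Log);
  Registry.create<Tracked>(2, Log);
  Registry.shutdown();
  assert(Log.size() == 2 && Log[0] == 2 && Log[1] == 1);
}
```

Compared with the explicit `unique_ptr` fields in `GlobalHandler`, this avoids enumerating types but adds type erasure and gives up named accessors, which matches the complexity concern raised in the reply.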