Skip to content

Draft: [SYCL] Implement resource pool for implementation allocations #5662

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions sycl/doc/EnvironmentVariables.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ compiler and runtime.
| `SYCL_DEVICE_TYPE` (deprecated) | CPU, GPU, ACC, HOST | Force SYCL to use the specified device type. If unset, default selection rules are applied. If set to any unlisted value, this control has no effect. If the requested device type is not found, a `cl::sycl::runtime_error` exception is thrown. If a non-default device selector is used, a device must satisfy both the selector and this control to be chosen. This control only has effect on devices created with a selector. We are planning to deprecate `SYCL_DEVICE_TYPE` environment variable in the future. The specific grace period is not decided yet. Please use the new env var `SYCL_DEVICE_FILTER` instead. |
| `SYCL_DEVICE_FILTER` | `backend:device_type:device_num` | See Section [`SYCL_DEVICE_FILTER`](#sycl_device_filter) below. |
| `SYCL_DEVICE_ALLOWLIST` | See [below](#sycl_device_allowlist) | Filter out devices that do not match the pattern specified. `BackendName` accepts `host`, `opencl`, `level_zero` or `cuda`. `DeviceType` accepts `host`, `cpu`, `gpu` or `acc`. `DeviceVendorId` accepts uint32_t in hex form (`0xXYZW`). `DriverVersion`, `PlatformVersion`, `DeviceName` and `PlatformName` accept regular expression. Special characters, such as parenthesis, must be escaped. DPC++ runtime will select only those devices which satisfy provided values above and regex. More than one device can be specified using the piping symbol "\|".|
| `SYCL_DISABLE_AUXILIARY_RESOURCE_POOL` | Any(\*) | Disables the auxiliary resource pool, preventing the reuse of device resources by operations like reductions. |
| `SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING` | Any(\*) | Disables automatic rounding-up of `parallel_for` invocation ranges. |
| `SYCL_CACHE_DIR` | Path | Path to persistent cache root directory. Default values are `%AppData%\libsycl_cache` for Windows and `$XDG_CACHE_HOME/libsycl_cache` on Linux, if `XDG_CACHE_HOME` is not set then `$HOME/.cache/libsycl_cache`. When none of the environment variables are set SYCL persistent cache is disabled. |
| `SYCL_CACHE_DISABLE_PERSISTENT (deprecated)` | Any(\*) | Has no effect. |
Expand Down
4 changes: 4 additions & 0 deletions sycl/include/CL/sycl/buffer.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,8 @@ class queue;
template <int dimensions> class range;

namespace detail {
template <typename T, int Dims> struct ManagedResource;

template <typename T, int Dimensions, typename AllocatorT>
buffer<T, Dimensions, AllocatorT, void>
make_buffer_helper(pi_native_handle Handle, const context &Ctx, event Evt) {
Expand Down Expand Up @@ -532,6 +534,8 @@ class buffer {
template <typename HT, int HDims, typename HAllocT>
friend buffer<HT, HDims, HAllocT, void>
detail::make_buffer_helper(pi_native_handle, const context &, event);
template <typename RT, int RDims> friend struct detail::ManagedResource;

range<dimensions> Range;
// Offset field specifies the origin of the sub buffer inside the parent
// buffer
Expand Down
7 changes: 7 additions & 0 deletions sycl/include/CL/sycl/detail/buffer_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -153,6 +153,13 @@ class __SYCL_EXPORT buffer_impl final : public SYCLMemObjT {
: BaseT(MemObject, SyclContext, SizeInBytes, std::move(AvailableEvent),
std::move(Allocator)) {}

buffer_impl(RT::PiMem MemObject, const context &SyclContext,
const size_t SizeInBytes,
std::unique_ptr<SYCLMemObjAllocator> Allocator,
event AvailableEvent)
: BaseT(MemObject, SyclContext, SizeInBytes, std::move(AvailableEvent),
std::move(Allocator)) {}

void *allocateMem(ContextImplPtr Context, bool InitFromUserData,
void *HostPtr, RT::PiEvent &OutEventToWait) override;
void constructorNotification(const detail::code_location &CodeLoc,
Expand Down
268 changes: 268 additions & 0 deletions sycl/include/CL/sycl/detail/resource_pool.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,268 @@
//==------------- resource_pool.hpp - USM resource pool ---------*- C++-*---==//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#pragma once

#include <CL/sycl/buffer.hpp>
#include <CL/sycl/detail/defines_elementary.hpp>

#include <cassert>
#include <memory>
#include <mutex>
#include <set>

__SYCL_INLINE_NAMESPACE(cl) {
namespace sycl {
namespace detail {

// Forward declarations
class context_impl;
class queue_impl;
class device_impl;
class platform_impl;
class ResourcePool;

struct __SYCL_EXPORT ManagedResourceBase {
ManagedResourceBase() = delete;
virtual ~ManagedResourceBase();

protected:
ManagedResourceBase(size_t Size, RT::PiMem Mem, ResourcePool *Origin)
: MSize(Size), MMem(Mem), MOrigin(Origin) {}

/// Size of the memory in the managed resource.
size_t MSize;

/// Memory associated with the managed resource.
RT::PiMem MMem;

/// The resource pool the resource was taken from.
ResourcePool *MOrigin;

friend class ResourcePool;
};

template <typename T, int Dims>
struct ManagedResource : public ManagedResourceBase {
ManagedResource() = delete;

/// Gets the buffer associated with the resource.
///
/// \return the buffer associated with the resource.
buffer<T, Dims, buffer_allocator, void> &getBuffer() { return MBuffer; }

private:
/// Creates a buffer implementation.
///
/// \param Size is the size of the memory passed to the buffer.
/// \param Mem is the memory for the buffer.
/// \param ContextImplPtr is the context implementation the memory is
/// associated with.
/// \param AvailableEvent is an event tied to the availability of the data in
/// the memory.
/// \return a shared pointer to the resulting buffer implementation.
static std::shared_ptr<buffer_impl>
createBufferImpl(size_t Size, RT::PiMem Mem,
const std::shared_ptr<context_impl> &ContextImplPtr,
event AvailableEvent) {
return std::make_shared<buffer_impl>(
Mem, createSyclObjFromImpl<context>(ContextImplPtr), Size,
make_unique_ptr<detail::SYCLMemObjAllocatorHolder<buffer_allocator>>(),
AvailableEvent);
}

ManagedResource(size_t Size, RT::PiMem Mem, ResourcePool *Origin,
range<Dims> Range,
const std::shared_ptr<context_impl> &ContextImplPtr,
event AvailableEvent = event{})
: ManagedResourceBase(Size, Mem, Origin),
MBuffer(createBufferImpl(Size, Mem, ContextImplPtr, AvailableEvent),
Range, 0,
/*IsSubBuffer=*/false) {}

// Constructor for when pool is disabled.
ManagedResource(ResourcePool *Origin, range<Dims> Range, T *DataPtr)
: ManagedResourceBase(0, nullptr, Origin), MBuffer(DataPtr, Range) {}

/// Buffer owned by the resource.
buffer<T, Dims, buffer_allocator, void> MBuffer;

friend class ResourcePool;
};

class __SYCL_EXPORT ResourcePool {
private:
/// Free entry in the resource pool. This represents an allocation owned by
/// the pool that is not currently in use.
struct FreeEntry {
/// Byte size of the free entry.
size_t Size;
/// Memory allocation of the free entry.
RT::PiMem Mem;
};

/// Comparison of free entries by size. This is used for fast lookup by size
/// in the pool.
struct FreeEntryCompare {
using is_transparent = void;
bool operator()(FreeEntry const &lhs, FreeEntry const &rhs) const {
return lhs.Size < rhs.Size;
}
bool operator()(FreeEntry const &lhs, size_t rhs) const {
return lhs.Size < rhs;
}
bool operator()(size_t lhs, FreeEntry const &rhs) const {
return lhs < rhs.Size;
}
};

/// Extracts a free entry from the pool that fits the size required. If there
/// is no suitable entry, new memory will be allocated.
///
/// \param Range is the range of the resulting buffer.
/// \param QueueImplPtr is the queue with the context to allocate memory in.
/// \param DataPtr is the pointer to data on the host to initialize the
/// associated memory with. This will only be used if a new entry is
/// allocated.
/// \param IsNewEntry will be set to true if the entry was newly allocated in
/// the pool and false if it was found in the existing free entries in
/// the pool. This is not set if it is nullptr.
/// \return a shared pointer to the new managed resource.
FreeEntry getOrAllocateEntry(const size_t Size,
const std::shared_ptr<queue_impl> &QueueImplPtr,
void *DataPtr = nullptr,
bool *IsNewEntry = nullptr);

/// Extracts a free entry from the pool that fits the size required. If there
/// is no suitable entry, new memory will be allocated. The memory will be
/// initialized with the data given.
///
/// \param Size is the size of the free entry to find or allocate.
/// \param QueueImplPtr is the queue with the context to allocate memory in.
/// \param DataPtr is the pointer to data on the host to initialize the
/// associated memory with.
/// \param AvailableEvent will be set to an event that is tied to the
/// initialization of the memory.
/// \param IsNewEntry will be set to true if the entry was newly allocated in
/// the pool and false if it was found in the existing free entries in
/// the pool. This is not set if it is nullptr.
/// \return a shared pointer to the new managed resource.
FreeEntry getOrAllocateEntry(const size_t Size,
const std::shared_ptr<queue_impl> &QueueImplPtr,
void *DataPtr, event *AvailableEvent,
bool *IsNewEntry = nullptr);

/// Gets the context implementation associtated with a queue implementation.
///
/// \param QueueImplPtr is the queue implementation to get the context
/// implementation from. \return the context implementation from the queue
/// implementation.
static const std::shared_ptr<context_impl> &
getQueueContextImpl(const std::shared_ptr<queue_impl> &QueueImplPtr);

using ContextPtr = context_impl *;

public:
/// Removes and deallocates all free entries currently in the pool.
void clear();

ResourcePool();
ResourcePool(const ResourcePool &) = delete;
~ResourcePool() {
clear();
assert(MAllocCount == 0 && "Not all resources have returned to the pool.");
}

/// Returns true if the resource pool is enabled and false otherwise.
///
/// \return a boolean value specifying whether the pool is enabled.
bool isEnabled() const { return MIsPoolingEnabled; }

/// Creates a managed resource from the pool.
///
/// \param Range is the range of the resulting buffer.
/// \param QueueImplPtr is the queue with the context to allocate memory in.
/// \return a shared pointer to the new managed resource.
template <typename T, int Dims>
std::shared_ptr<ManagedResource<T, Dims>>
getOrAllocateResource(range<Dims> Range,
const std::shared_ptr<queue_impl> &QueueImplPtr) {
// If pool is disabled we return a buffer that will not return to the pool.
if (!MIsPoolingEnabled)
return std::shared_ptr<ManagedResource<T, Dims>>{
new ManagedResource<T, Dims>(this, Range, nullptr)};

// Get or allocate a free entry that fits the requirements.
FreeEntry Entry =
getOrAllocateEntry(Range.size() * sizeof(T), QueueImplPtr);
return std::shared_ptr<ManagedResource<T, Dims>>{
new ManagedResource<T, Dims>(Entry.Size, Entry.Mem, this, Range,
getQueueContextImpl(QueueImplPtr))};
}

/// Creates a managed resource from the pool and sets te data of the
/// associated memory.
///
/// \param Range is the range of the resulting buffer.
/// \param QueueImplPtr is the queue with the context to allocate memory in.
/// \param DataPtr is the pointer to data on the host to initialize the
/// resource with. This must contain at least the size of Range.
/// \return a shared pointer to the new managed resource.
template <typename T, int Dims>
std::shared_ptr<ManagedResource<T, Dims>>
getOrAllocateResource(range<Dims> Range,
const std::shared_ptr<queue_impl> &QueueImplPtr,
T *DataPtr) {
// If pool is disabled we return a buffer that will not return to the pool.
if (!MIsPoolingEnabled)
return std::shared_ptr<ManagedResource<T, Dims>>{
new ManagedResource<T, Dims>(this, Range, DataPtr)};

// Get or allocate a free entry that fits the requirements.
event AvailableEvent;
FreeEntry Entry = getOrAllocateEntry(Range.size() * sizeof(T), QueueImplPtr,
DataPtr, &AvailableEvent);
return std::shared_ptr<ManagedResource<T, Dims>>{
new ManagedResource<T, Dims>(Entry.Size, Entry.Mem, this, Range,
getQueueContextImpl(QueueImplPtr),
AvailableEvent)};
}

private:
/// Returns a resouce to the pool.
///
/// \param Size is the size of the resource.
/// \param Mem is the memory of the resource.
void returnResourceToPool(const size_t Size, RT::PiMem Mem) {
std::lock_guard<std::mutex> Lock{MMutex};
MFreeEntries.insert({Size, Mem});
}

friend struct ManagedResourceBase;

/// Is true if the pool is enabled and false otherwise. This is controlled by
/// the SYCL_DISABLE_AUXILIARY_RESOURCE_POOL config.
const bool MIsPoolingEnabled;

/// The platform associated with the pool.
std::shared_ptr<platform_impl> MPlatform;

/// Counter for allocations done by the pool that are currently alive. This
/// includes managed resources that are currently alive.
size_t MAllocCount = 0;

/// A set of all free entries in the pool.
std::multiset<FreeEntry, FreeEntryCompare> MFreeEntries;

/// Mutex protecting access to the pool.
std::mutex MMutex;
};

} // namespace detail
} // namespace sycl
} // __SYCL_INLINE_NAMESPACE(cl)
4 changes: 4 additions & 0 deletions sycl/include/CL/sycl/detail/sycl_mem_obj_t.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,10 @@ class __SYCL_EXPORT SYCLMemObjT : public SYCLMemObjI {
const size_t SizeInBytes, event AvailableEvent,
std::unique_ptr<SYCLMemObjAllocator> Allocator);

SYCLMemObjT(RT::PiMem MemObject, const context &SyclContext,
const size_t SizeInBytes, event AvailableEvent,
std::unique_ptr<SYCLMemObjAllocator> Allocator);
Comment on lines +87 to +89
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this c-tor supersede the other one? That is, should the other c-tor be marked with TODO for removal when ABI break is allowed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe so. The other constructor is intended to be user-facing and as such has additional checks for argument validity. The new constructor is meant for internal use so we do not need the additional checks.


SYCLMemObjT(cl_mem MemObject, const context &SyclContext,
event AvailableEvent,
std::unique_ptr<SYCLMemObjAllocator> Allocator)
Expand Down
29 changes: 29 additions & 0 deletions sycl/include/CL/sycl/handler.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
#include <CL/sycl/detail/export.hpp>
#include <CL/sycl/detail/handler_proxy.hpp>
#include <CL/sycl/detail/os_util.hpp>
#include <CL/sycl/detail/resource_pool.hpp>
#include <CL/sycl/event.hpp>
#include <CL/sycl/id.hpp>
#include <CL/sycl/interop_handle.hpp>
Expand Down Expand Up @@ -476,6 +477,34 @@ class __SYCL_EXPORT handler {
/// @param ReduObj is a pointer to object that must be stored.
void addReduction(const std::shared_ptr<const void> &ReduObj);

/// Gets the resource pool for the context associated with the handler.
///
/// \return a reference to the resource pool of the underlying context.
detail::ResourcePool &getResourcePool();

/// Gets or allocates a new resource from the resource pool.
///
/// \param Range is the range of the underlying buffer for the resource.
/// \return a shared pointer to the resulting resource.
template <typename T, int Dims>
std::shared_ptr<detail::ManagedResource<T, Dims>>
getOrAllocateResourceFromPool(range<Dims> Range) {
return getResourcePool().getOrAllocateResource<T, Dims>(Range, MQueue);
}

/// Gets or allocates a new resource from the resource pool and intialize it.
///
/// \param Range is the range of the underlying buffer for the resource.
/// \param DataPtr is a pointer to the data to initialize the resource with.
/// The data pointed to must be at least the size of range in bytes.
/// \return a shared pointer to the resulting resource.
template <typename T, int Dims>
std::shared_ptr<detail::ManagedResource<T, Dims>>
getOrAllocateResourceFromPool(range<Dims> Range, T *DataPtr) {
return getResourcePool().getOrAllocateResource<T, Dims>(Range, MQueue,
DataPtr);
}

~handler() = default;

bool is_host() { return MIsHost; }
Expand Down
Loading